On Convergence of Bayes Factor in Stochastic Differential Equations: Part II

Trisha Maitra and Sourabh Bhattacharya* 

Abstract 

The problem of model selection in the context of a system of stochastic differential equations 
(SDE's) has not been touched upon in the literature. Indeed, properties of Bayes factors have not 
been studied even in single SDE based model comparison problems. 

In this article, we first develop an asymptotic theory of Bayes factors when two SDE's are
compared, assuming the time domain expands. Using this we then develop an asymptotic theory
of Bayes factors when systems of SDE's are compared, assuming that the number of equations
in each system, as well as the time domain, increase indefinitely. Our asymptotic theory covers
situations when the observed processes associated with the SDE's are independently and identically
distributed (iid), as well as when they are independently but not identically distributed (non-iid).
Quite importantly, we allow inclusion of available time-dependent covariate information into each
SDE through a multiplicative factor of the drift function in a random effects set-up; different initial
values for the SDE's are also permitted.

Thus, our general model-selection framework includes simultaneously the variable selection
problem associated with time-varying covariates, as well as choice of the part of the drift function
free of covariates. It is to be noted that given that the underlying process is wholly observed,
the diffusion coefficient becomes known, and hence is not involved in the model selection problem.

For both iid and non-iid set-ups we establish almost sure exponential convergence of the Bayes
factor. As we show, the Bayes factor is inconsistent for comparing individual SDE's, in the sense
that the log-Bayes factor converges only in expectation, while the relevant variance does not converge
to zero. Nevertheless, it has been possible to exploit this result to establish almost sure exponential
convergence of the Bayes factor when, in addition, the number of individuals is also allowed
to increase indefinitely.

We carry out simulated and real data analyses to demonstrate that Bayes factor is a suitable candidate
for covariate selection in our SDE models even in non-asymptotic situations.

Keywords: Bayes factor consistency; Kullback-Leibler divergence; Martingale; Stochastic differential
equations; Time-dependent covariates and random effects; Variable selection.


1 Introduction 


Stochastic differential equations (SDE's) have important standing in statistical applications where
"within" subject variability is caused by some random component varying continuously in time. It also
seems worthwhile to incorporate available time-dependent covariate information into the subject-wise
SDE's. Apart from the covariates there may also be random effects associated with the individuals,
which may be useful in modeling variabilities between the individuals.


SDE-based models with time-dependent covariates are considered in Oravecz et al. (2011), Overgaard
et al. (2005) and Leander et al. (2015); moreover, Oravecz et al. (2011) analyse their covariate-based
SDE model in the hierarchical Bayesian paradigm. In the literature, random effects SDE models
without covariates seem to be more popular than those based on covariates. A brief overview of random
effects SDE models is provided in Delattre et al. (2013), who undertake theoretical and classical
asymptotic investigation of a class of random effects models based on SDE's. Specifically, they model
the $i$-th individual by

$$dX_i(t) = b(X_i(t), \phi_i)\,dt + \sigma(X_i(t))\,dW_i(t), \tag{1.1}$$


where, for $i = 1, \ldots, n$, $X_i(0) = x^i$ is the initial value of the stochastic process $X_i(t)$, which is
assumed to be continuously observed on the time interval $[0, T_i]$; $T_i > 0$ is assumed to be known. The
function $b(x, \varphi)$, which is the drift function, is a known, real-valued function on $\mathbb{R} \times \mathbb{R}^d$ ($\mathbb{R}$ is the real
line and $d$ is the dimension), and the function $\sigma : \mathbb{R} \mapsto \mathbb{R}$ is the known diffusion coefficient. The
SDE's given by (1.1) are driven by independent standard Wiener processes $\{W_i(\cdot);\ i = 1, \ldots, n\}$, and
$\{\phi_i;\ i = 1, \ldots, n\}$, which are to be interpreted as the random effect parameters associated with the $n$
individuals, are assumed by Delattre et al. (2013) to be independent of the Brownian motions and
independently and identically distributed (iid) random variables with some common distribution. For
the sake of convenience Delattre et al. (2013) (see also Maitra and Bhattacharya (2016c) and Maitra
and Bhattacharya (2015)) assume $b(x, \phi_i) = \phi_i b(x)$. Thus, the random effect is a multiplicative factor
of the drift function. In this work, we generalize this to a random effects SDE set-up consisting of
time-dependent covariates.
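
To make the multiplicative random effect concrete, the following is a minimal simulation sketch of the model with $b(x, \phi_i) = \phi_i b(x)$, using an Euler-Maruyama discretization. The discretization scheme, the drift $b(x) = -x$, the unit diffusion coefficient and the uniform random-effect distribution are all illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_random_effect_sde(phi, b, sigma, x0, T, n_steps, rng):
    """Euler-Maruyama path of dX = phi * b(X) dt + sigma(X) dW on [0, T]."""
    dt = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        path.append(x + phi * b(x) * dt + sigma(x) * dw)
    return path


rng = random.Random(0)
paths = []
for i in range(3):
    # Hypothetical random effect: phi_i drawn iid from Uniform(0.5, 1.5).
    phi_i = rng.uniform(0.5, 1.5)
    paths.append(
        simulate_random_effect_sde(
            phi_i, lambda x: -x, lambda x: 1.0, x0=1.0, T=5.0, n_steps=500, rng=rng
        )
    )
```

Each individual shares the same functional form of the drift and differs only through its own draw of the random effect, which is the structure that the covariate-dependent factor introduced later generalizes.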

Note that model selection constitutes an important part of research in both Bayesian and classical
paradigms; see, for example, Dey et al. (2000), Jiang (2007), Claeskens and Hjort (2008), Muller
et al. (2013). In the case of SDE-based mixed effects models as well, model selection constitutes an
important issue involving the choice of the drift function and selection of the appropriate subset of
(time-dependent) covariates. Here Bayes factors are expected to play the central role as their effectiveness in
model selection in complex problems is well-established (see, for example, Kass and Raftery (1995)
for a good account of Bayes factors). Unavailability of closed form expressions in the traditional SDE
set-ups usually prompts usage of numerical approximations based on Markov chain Monte Carlo or related
criteria such as the Akaike Information Criterion (Akaike (1973)) and Bayes Information Criterion
(Schwarz (1978)). For details, see, for example, Fuchs (2013), Iacus (2008). But we are not aware of
any research existing in the literature that attempts to address covariate selection in SDE's.

We are also not aware of any existing literature on asymptotic investigation of Bayes factors in
the SDE context, although Sivaganesan and Lingham (2002) present some asymptotic investigation of
intrinsic and fractional Bayes factors in the context of three specific diffusion models. The only
investigation available in this context seems to be that of Maitra and Bhattacharya (2016a), who model a
multiplicative part of the drift function using time-varying covariates, and address Bayes factor
asymptotics in a general set-up consisting of the covariate selection problem as well as selection of the part
of the drift function independent of the covariates. Different initial values and domains of observations
pertaining to different individuals are also considered in their set-up. Assuming that only the number of
individuals increases without bound, Maitra and Bhattacharya (2016a) establish almost sure exponential
convergence of Bayes factor in both iid and non-iid situations. Here we recall that the iid set-up is the
case when there is no covariate associated with the model and when the initial values and the domains of
observations are the same for every individual. The non-iid set-up, on the other hand, consists of
time-varying covariates, different initial values and domains of observations; in this work we also consider
random effects. Thus, unlike the iid case, here the model selection problem also deals with covariate
selection apart from selection of the part of the drift functions free of the covariates.

In this article, we prove almost sure exponential convergence of the relevant Bayes factors in both
iid and non-iid cases, assuming that the number of individuals, as well as the domains of observations,
increase without bound. Hence, for our current purpose, the asymptotic theory developed by Maitra
and Bhattacharya (2016a) when only the number of individuals tends to infinity is clearly inapplicable.
Indeed, incorporation of random effects is asymptotically feasible only in our current asymptotic
framework; Maitra and Bhattacharya (2016a) elucidate that inclusion of random effects does not make sense
asymptotically unless the domains of observations are also increased indefinitely. Also, only our current
asymptotic framework allows different sets of time-dependent covariates for different individuals.

It is important to remark that the diffusion coefficient becomes known once the continuous process
is completely observed; see Roberts and Stramer (2001). Hence, following Maitra and Bhattacharya
(2016a) we assume that the diffusion coefficient is known, and is not involved in the model selection
problem.

We begin by establishing an asymptotic theory of Bayes factor for two competing individual SDE's,
and then extend the theory to systems of SDE's. In this context it is important to draw attention to
the fact that even this relatively simple problem of comparing any two individual SDE's using Bayes
factors has not yet been considered in the literature. Our investigation in this simpler case, however,
is faced with an apparently negative result; the associated Bayes factor fails to be consistent in the sense
that the relevant variance fails to converge to zero, even though convergence of the log-Bayes factor
in expectation is ensured. Despite this, we have been able to utilise this result to establish almost sure
exponential convergence of the Bayes factor when the number of individuals is also allowed to increase
indefinitely.

The rest of our article is structured as follows. We begin with formalization of our set-up in Section 2,
while we provide the necessary assumptions and results in Section 3. In Section 4 we investigate
the asymptotics of Bayes factor for comparing two individual SDE's. We illustrate our results with a
special case in Section 5. In the sections that follow, we exploit the asymptotic theory of Bayes factors
developed for comparing individual SDE's to construct a convergence theory of Bayes factors comparing
systems of SDE's in both iid and non-iid cases; we carry out two simulation studies to demonstrate
that Bayes factor yields the correct set of covariates in our SDE models even in non-asymptotic cases;
and we model a real, company-wise national stock exchange data set, using a system of
SDE's, each consisting of a plausible set of covariates, and obtain the best possible sets of covariate
combinations for the companies, using Bayes factor. We summarize our contributions and provide
concluding remarks in the concluding section.

2 Formalization of the model selection problem in the SDE set-up when $n \to \infty$ and $T_i \to \infty$ for every $i$

That the systems considered by us are well-defined and the exact likelihoods are computable is
guaranteed by assumption (H2") in Section 3. For our purpose we consider the filtration $(\mathcal{F}_t^{W_i}, t \geq 0)$, where
$\mathcal{F}_t^{W_i} = \sigma(W_i(s), s \leq t)$. Each process $W_i$ is a $(\mathcal{F}_t^{W_i}, t \geq 0)$-adapted Brownian motion.

Here we consider the set-up where, for $i = 1, 2, \ldots, n$,

$$dX_i(t) = \phi_{i,\xi_0^{(i)}}(t)\, b_{\beta_0^{(i)}}(t, X_i(t))\,dt + \sigma(t, X_i(t))\,dW_i(t) \tag{2.1}$$

and

$$dX_i(t) = \phi_{i,\xi_1^{(i)}}(t)\, b_{\beta_1^{(i)}}(t, X_i(t))\,dt + \sigma(t, X_i(t))\,dW_i(t), \tag{2.2}$$

where $X_i(0) = x^i$ is the initial value of the stochastic process $X_i(t)$, which is assumed to be
continuously observed on the time interval $[0, T_i]$; $T_i > 0$. We consider (2.1) as representing the true model, and
(2.2) is any other model.

It is useful to remark that we must analyze the same data set with respect to two different models
for the purpose of model selection. Hence, even though the distributions of the underlying stochastic
process under the two models are different, for notational convenience we denote the process by $X_i(t)$
under both the models, relying on the context and the model-specific parameters to naturally clarify the
distinction.


2.1 Inclusion of time-dependent covariates 

We model $\phi_{i,\xi_j^{(i)}}(t)$ for $j = 0, 1$, and $i = 1, \ldots, n$, as

$$\phi_{i,\xi_j^{(i)}}(t) = \xi_{0j}^{(i)} + \xi_{1j}^{(i)} g_1(z_{i1}(t)) + \xi_{2j}^{(i)} g_2(z_{i2}(t)) + \cdots + \xi_{pj}^{(i)} g_p(z_{ip}(t)), \tag{2.3}$$

where $z_i(t) = (z_{i1}(t), z_{i2}(t), \ldots, z_{ip}(t))$ is the set of available covariate information corresponding to
the $i$-th individual, depending upon time $t$. Following Maitra and Bhattacharya (2016a) we assume $z_i(t)$
is continuous in $t$, $z_{il}(t) \in Z_l$ where $Z_l$ is compact and $g_l : Z_l \to \mathbb{R}$ is continuous, for $l = 1, \ldots, p$.
We let $Z = Z_1 \times \cdots \times Z_p$, and $\mathfrak{Z} = \{z(t) \in Z : t \in [0, \infty) \text{ such that } z(t) \text{ is continuous in } t\}$. Hence,
$z_i \in \mathfrak{Z}$ for all $i$.
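
The covariate-dependent factor in (2.3) is a linear combination of transformed covariates, and can be sketched as a small function factory. The helper name `make_phi`, the particular covariate paths and the transformations below are hypothetical choices for illustration only; the paper merely requires the covariates to be continuous with values in compact sets and the $g_l$ to be continuous.

```python
import math


def make_phi(xi, g_funcs, z_funcs):
    """Return t -> xi[0] + sum_l xi[l] * g_l(z_l(t)), the form used in (2.3)."""
    if not (len(xi) == len(g_funcs) + 1 == len(z_funcs) + 1):
        raise ValueError("xi needs one intercept plus one coefficient per covariate")

    def phi(t):
        return xi[0] + sum(c * g(z(t)) for c, g, z in zip(xi[1:], g_funcs, z_funcs))

    return phi


# Hypothetical covariates: bounded and continuous in t, so each stays in a compact set.
z_funcs = [math.sin, lambda t: math.cos(2.0 * t)]
g_funcs = [lambda z: z, lambda z: z * z]

phi_full = make_phi([1.0, 0.5, -2.0], g_funcs, z_funcs)
# Setting every covariate coefficient to zero leaves only the intercept.
phi_no_cov = make_phi([1.0, 0.0, 0.0], g_funcs, z_funcs)
```

Covariate selection in this framework amounts to comparing models whose coefficient vectors have different non-zero entries, as in `phi_full` versus `phi_no_cov`.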

2.2 The random effects set-up

In (2.1), $\theta_0^{(i)} = (\beta_0^{(i)}, \xi_0^{(i)}) = (\beta_0^{(i)}, \xi_{00}^{(i)}, \xi_{10}^{(i)}, \ldots, \xi_{p0}^{(i)})$ stands for the true parameters, and
$\theta_1^{(i)} = (\beta_1^{(i)}, \xi_1^{(i)}) = (\beta_1^{(i)}, \xi_{01}^{(i)}, \xi_{11}^{(i)}, \ldots, \xi_{p1}^{(i)})$ are the parameters associated with (2.2). Let
$\theta_1^{(i)} \in \Theta = \mathfrak{B} \times \Gamma$ for all $i$, where both $\mathfrak{B}$ and $\Gamma$ are compact spaces. We also assume that for
$i = 1, 2, \ldots, n$,

$$\theta_1^{(i)} \stackrel{iid}{\sim} \pi,$$

where $\pi$ is some specified distribution on $\Theta$.

Hence, the above describes a random effects set-up. Observe that if $\xi_{lj}^{(i)} = 0$ for $l = 1, \ldots, p$, and
for $i = 1, \ldots, n$, then it reduces to the random effects model of Delattre et al. (2013), showing that the
latter is a special case of our model.

As is well-known, even though the term "prior" is not appropriate for the random effects coefficients,
operationally there is no difference between a prior and a distribution for random effects in the Bayesian
paradigm. Somewhat abusing the terminology, we continue to refer to the distribution of the iid random
effects coefficients, $\pi$, as the relevant prior.


2.3 Covariate and drift function selection 


The key difference between our current model selection idea and that of Maitra and Bhattacharya (2016a)
is that here, for every individual, there is an independent model selection problem. In other words,
for each $i$, one needs to choose between $\theta_0^{(i)}$ and $\theta_1^{(i)}$. This involves selection of perhaps different
sets of covariates for different $i$ with respect to the coefficients $\xi_j^{(i)}$, and different drift functions $b_{\beta_j^{(i)}}$.
Obviously, the dimensions of $\xi_0^{(i)}$ and $\xi_1^{(i)}$ are allowed to differ for each $i$; likewise, for every $i$, the
dimensions of $\beta_0^{(i)}$ and $\beta_1^{(i)}$ may be different as well. Thus, from this perspective, our current model
selection framework appears to be more general compared to that of Maitra and Bhattacharya (2016a),
who consider the same set of parameters $\xi_j$ and $\beta_j$ for all the individuals, allowing only a fixed set of
covariates for every subject.


2.4 Form of the Bayes factor in our set-up 

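For $j = 0, 1$, we first define the following quantities:

$$U_{i,\theta_j^{(i)}} = \int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))}{\sigma^2(s, X_i(s))}\,dX_i(s), \qquad V_{i,\theta_j^{(i)}} = \int_0^{T_i} \frac{\phi^2_{i,\xi_j^{(i)}}(s)\, b^2_{\beta_j^{(i)}}(s, X_i(s))}{\sigma^2(s, X_i(s))}\,ds, \tag{2.4}$$

for $j = 0, 1$ and $i = 1, \ldots, n$.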

Let $C_{T_i}$ denote the space of real continuous functions $(x(t), t \in [0, T_i])$ defined on $[0, T_i]$, endowed
with the $\sigma$-field $\mathcal{C}_{T_i}$ associated with the topology of uniform convergence on $[0, T_i]$. We consider the
distribution $P_j^{x^i, T_i, z_i}$ on $(C_{T_i}, \mathcal{C}_{T_i})$ of $(X_i(t), t \in [0, T_i])$ given by (2.1) and (2.2) for $j = 0, 1$. We
choose the dominating measure $P_i$ as the distribution of (2.1) and (2.2) with null drift. So, for $j = 0, 1$,

$$\frac{dP_j^{x^i, T_i, z_i}}{dP_i} = f_{i,\theta_j^{(i)}}(X_i) = \exp\left(U_{i,\theta_j^{(i)}} - \frac{V_{i,\theta_j^{(i)}}}{2}\right), \tag{2.5}$$

where $f_{i,\theta_0^{(i)}}(X_i)$ denotes the true density and $f_{i,\theta_1^{(i)}}(X_i)$ stands for the other density associated with the
modeled SDE.
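
For a path recorded on a fine, equally spaced grid, the quantities in (2.4) and the log-density in (2.5) can be approximated by Riemann and Ito sums. The sketch below is only a discretized illustration under that assumption (the paper works with continuously observed paths), and the function names are hypothetical.

```python
def girsanov_terms(path, dt, phi, b, sigma):
    """Discretized U = int phi*b/sigma^2 dX and V = int phi^2*b^2/sigma^2 ds."""
    U = 0.0
    V = 0.0
    for k in range(len(path) - 1):
        s, x = k * dt, path[k]
        drift = phi(s) * b(s, x)
        ratio = drift / sigma(s, x) ** 2
        U += ratio * (path[k + 1] - x)  # left-endpoint (Ito) evaluation
        V += ratio * drift * dt
    return U, V


def log_density(path, dt, phi, b, sigma):
    """log f = U - V / 2, the discretized analogue of (2.5)."""
    U, V = girsanov_terms(path, dt, phi, b, sigma)
    return U - 0.5 * V


# Toy check on a deterministic path with constant unit drift factor and diffusion.
toy_path = [0.0, 0.2, 0.4, 0.6]
toy_value = log_density(
    toy_path, 0.1, lambda s: 1.0, lambda s, x: 1.0, lambda s, x: 1.0
)
```

The left-endpoint evaluation matters: it is what makes the sum an approximation of the Ito integral in $U$ rather than of a Stratonovich integral.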

For each $i = 1, \ldots, n$, letting $X_{i,a,b}$ denote the $i$-th process observed on $[a, b]$ for any $0 \leq a < b < \infty$,

$$I_{x^i, T_i, z_i} = \int_{\Theta} \frac{f_{i,\theta_1^{(i)}}(X_{i,0,T_i})}{f_{i,\theta_0^{(i)}}(X_{i,0,T_i})}\, \pi\left(d\theta_1^{(i)}\right) \tag{2.6}$$

denotes the Bayes factor associated with the $i$-th equation of the above two systems of equations.
Assuming that the SDE's (2.1) and (2.2) are independent for $i = 1, \ldots, n$,

$$I_{n, T_1, \ldots, T_n} = \prod_{i=1}^{n} I_{x^i, T_i, z_i}$$

is the Bayes factor comparing the entire systems of SDE's (2.1) and (2.2).

A comparison among a collection of different models using Bayes factors, none of which may be the
true model, is expected to favour the model that minimizes the Kullback-Leibler divergence from the
true model.
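
When the integral over the prior in (2.6) has no closed form, one simple way to approximate it is to average the likelihood ratio over draws from $\pi$. The sketch below assumes the log-densities have already been computed for each prior draw; the Monte Carlo averaging and the function names are illustrative assumptions, not the authors' procedure. Working on the log scale also turns the product over individuals into a sum.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def log_bayes_factor(log_f1_draws, log_f0):
    """Approximate log I_i from log f_{theta_1}(X_i) at prior draws of theta_1."""
    return log_mean_exp([lf1 - log_f0 for lf1 in log_f1_draws])


def log_system_bayes_factor(per_individual):
    """Sum of individual log Bayes factors, i.e. the log of their product."""
    return sum(log_bayes_factor(draws, lf0) for draws, lf0 in per_individual)


# Hypothetical log-densities for two individuals, two prior draws each.
example = [([math.log(2.0), math.log(4.0)], 0.0), ([0.0, 0.0], 0.0)]
log_I_system = log_system_bayes_factor(example)
```

Subtracting the maximum before exponentiating avoids overflow, which matters here because the log-likelihood ratios grow linearly with the length of the observation window.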


2.5 The iid and the non-iid cases


We are interested in studying the properties of $I_{n,T_1,\ldots,T_n}$ in both iid and non-iid cases when $n \to \infty$ and
$T_i \to \infty$. In the iid set-up, we assume that $x^i = x$, $T_i = T$ and $\theta_j^{(i)} = \theta_j$ for $i = 1, \ldots, n$ and
$j = 0, 1$. In the non-iid case we relax these assumptions. However, for simplicity, we assume $T_i = T$
for each $i$, even in the non-iid set-up, so that in our asymptotic framework we study convergence of

$$I_{n,T} = \prod_{i=1}^{n} I_{i,T} \tag{2.7}$$

as $n \to \infty$ and $T \to \infty$, where $I_{i,T} = I_{x^i, T, z_i}$.


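2.6 A key relation between $U_{i,\theta_j^{(i)}}$ and $V_{i,\theta_j^{(i)}}$ in the context of model selection using Bayes factors

A useful relation between $U_{i,\theta_j^{(i)}}$ and $V_{i,\theta_j^{(i)}}$ which we will often make use of in this paper is as follows.

$$\begin{aligned}
U_{i,\theta_j^{(i)}} &= \int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))}{\sigma^2(s, X_i(s))}\, dX_i(s) \\
&= \int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))}{\sigma^2(s, X_i(s))} \left[\phi_{i,\xi_0^{(i)}}(s)\, b_{\beta_0^{(i)}}(s, X_i(s))\, ds + \sigma(s, X_i(s))\, dW_i(s)\right] \\
&= \int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, \phi_{i,\xi_0^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))\, b_{\beta_0^{(i)}}(s, X_i(s))}{\sigma^2(s, X_i(s))}\, ds + \int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))}{\sigma(s, X_i(s))}\, dW_i(s) \\
&= V_{i,\theta_0^{(i)},\theta_j^{(i)}} + \int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))}{\sigma(s, X_i(s))}\, dW_i(s),
\end{aligned} \tag{2.8}$$

with

$$V_{i,\theta_0^{(i)},\theta_j^{(i)}} = \int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, \phi_{i,\xi_0^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))\, b_{\beta_0^{(i)}}(s, X_i(s))}{\sigma^2(s, X_i(s))}\, ds. \tag{2.9}$$

Note that $V_{i,\theta_0^{(i)}} = V_{i,\theta_0^{(i)},\theta_0^{(i)}}$ and $V_{i,\theta_1^{(i)}} = V_{i,\theta_1^{(i)},\theta_1^{(i)}}$. Also note that, for $j = 0, 1$, for each $i$,

$$E_{\theta_0^{(i)}}\left[\int_0^{T_i} \frac{\phi_{i,\xi_j^{(i)}}(s)\, b_{\beta_j^{(i)}}(s, X_i(s))}{\sigma(s, X_i(s))}\, dW_i(s)\right] = 0, \tag{2.10}$$

so that $E_{\theta_0^{(i)}}\left(U_{i,\theta_j^{(i)}}\right) = E_{\theta_0^{(i)}}\left(V_{i,\theta_0^{(i)},\theta_j^{(i)}}\right)$.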
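
The decomposition of $U$ into a cross term $V$ plus a stochastic integral holds step by step when the data are generated by an Euler scheme under the true model, so it can be checked numerically. The sketch below assumes such an Euler-generated path; the specific drift factors, drift functions and constant diffusion coefficient are hypothetical choices made only for the check.

```python
import math
import random


def check_u_decomposition(phi0, phi1, b0, b1, sigma, x0, T, n_steps, seed=0):
    """Return discretized (U_1, V_{0,1}, martingale term) along one simulated path."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    U1 = V01 = mart = 0.0
    for k in range(n_steps):
        s = k * dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        true_drift = phi0(s) * b0(s, x)
        dx = true_drift * dt + sigma(s, x) * dw  # Euler step under the true model
        w = phi1(s) * b1(s, x)                   # drift of the competing model
        U1 += w / sigma(s, x) ** 2 * dx
        V01 += w * true_drift / sigma(s, x) ** 2 * dt
        mart += w / sigma(s, x) * dw
        x += dx
    return U1, V01, mart


U1, V01, mart = check_u_decomposition(
    phi0=lambda s: 1.0 + 0.5 * math.sin(s),
    phi1=lambda s: 0.8,
    b0=lambda s, x: -x,
    b1=lambda s, x: -x,
    sigma=lambda s, x: 0.5,
    x0=1.0, T=10.0, n_steps=1000,
)
```

The three returned numbers satisfy `U1 == V01 + mart` up to floating-point rounding, and the last term is a sum of mean-zero increments, which is the discrete counterpart of the zero-expectation statement for the stochastic integral.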




3 Requisite assumptions and results for the asymptotic theory of Bayes factor when $n \to \infty$ and $T \to \infty$

All our following assumptions and results hold for each $i$, and consequently for
$U_{i,\theta_j^{(i)}}$, $V_{i,\theta_j^{(i)}}$ and $V_{i,\theta_0^{(i)},\theta_j^{(i)}}$. For the sake of notational simplicity we provide all the
assumptions and results without mentioning $i$ at every stage. We make the following assumptions:

(H1") The parameter space $\Theta = \mathfrak{B} \times \Gamma$ is such that $\mathfrak{B}$ and $\Gamma$ are compact.

(H2") For $j = 0, 1$, given any $s$, $b_{\beta_j}(s, \cdot)$ and $\sigma(s, \cdot)$ are $C^1$ on $\mathbb{R}$; we also assume that
$b^2_{\beta_j}(s, x) \leq K_1\left(1 + x^2 + \|\beta_j\|^2\right)$ and $\sigma^2(s, x) \leq K_2\left(1 + x^2\right)$ for all $s \in [0, T]$, $x \in \mathbb{R}$, for some $K_1, K_2 > 0$. By
(H1") it follows as before that for $s \in [0, T]$, $b^2_{\beta_j}(s, x) \leq K\left(1 + x^2\right)$ and $\sigma^2(s, x) \leq K\left(1 + x^2\right)$
for all $x \in \mathbb{R}$, for some $K > 0$.


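Because of (H2") it follows from Theorem 4.4 of Mao (2011), page 61, that for all $T > 0$, and any
$k \geq 2$,

$$E\left(\sup_{0 \leq s \leq T} |X_i(s)|^k\right) \leq \left(1 + 3^{k-1} E|X_i(0)|^k\right) \exp\left(\vartheta T\right),$$

where

$$\vartheta = \frac{1}{6}\,(18K)^{k/2}\, T^{(k-2)/2} \left[T^{k/2} + \left(\frac{k^3}{2(k-1)}\right)^{k/2}\right].$$

Specifically, for any $k \geq 2$, we can write, as $T \to \infty$,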


E ^ sup |A:(s)|^^ = o (^exp . 


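We further assume the following conditions.

(H3") $b_{\beta_j}(s, x)$ is continuous in $(x, \beta_j)$.

(H4") For $s \in [0, T]$ and $j = 0, 1$, $b_{\beta_j}$ and $\sigma$ satisfy the following:

$$\kappa_j(\beta_j) - \frac{M_{\beta_j}\left(1 + x^2 + \|\beta_j\|^2\right)}{c_j + \exp\left(T^5\right)} \leq \frac{b^2_{\beta_j}(s, x)}{\sigma^2(s, x)} \tag{3.1}$$

$$\frac{b^2_{\beta_j}(s, x)}{\sigma^2(s, x)} \leq \kappa_j(\beta_j) + \frac{M_{\beta_j}\left(1 + x^2 + \|\beta_j\|^2\right)}{d_j + \exp\left(T^5\right)} \tag{3.2}$$

and

$$\kappa(\beta_0, \beta_1) - \frac{M_{\beta_0,\beta_1}\left(1 + x^2 + \|\beta_0\|^2 + \|\beta_1\|^2\right)}{c + \exp\left(T^5\right)} \leq \frac{b_{\beta_1}(s, x)\, b_{\beta_0}(s, x)}{\sigma^2(s, x)} \tag{3.3}$$

$$\frac{b_{\beta_1}(s, x)\, b_{\beta_0}(s, x)}{\sigma^2(s, x)} \leq \kappa(\beta_0, \beta_1) + \frac{M_{\beta_0,\beta_1}\left(1 + x^2 + \|\beta_0\|^2 + \|\beta_1\|^2\right)}{d + \exp\left(T^5\right)}, \tag{3.4}$$

where $0 < c_j, d_j, c, d < \infty$ are some constants; $\kappa_j(\beta_j)$ are positive, continuous functions of $\beta_j$;
$\kappa(\beta_0, \beta_1)$ is a continuous function of $(\beta_0, \beta_1)$; $M_{\beta_j}$ are continuous in $\beta_j$, for $j = 0, 1$, and
$M_{\beta_0,\beta_1}$ is continuous in $(\beta_0, \beta_1)$.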
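
(H5") (i) We assume that $Z = Z_1 \times Z_2 \times \cdots \times Z_p$ is the space of the covariates where $Z_l$ is compact
for $l = 1, \ldots, p$, and for every $t \geq 0$, $z_i(t) = (z_{i1}(t), z_{i2}(t), \ldots, z_{ip}(t)) \in Z$ for $i = 1, \ldots, n$.
Also, we assume that $z_i(t)$ are continuous in $t$ for every $i$, so that $z_i \in \mathfrak{Z}$ for every $i$.

(ii) For $j = 0, 1$, and for $t \geq 0$, we assume that the vector of covariates $z_i(t)$ is related to the $i$-th
SDE of the $j$-th model via

$$\phi_{i,\xi_j^{(i)}}(t) = \phi_{i,\xi_j^{(i)}}(z_i(t)) = \xi_{0j}^{(i)} + \sum_{l=1}^{p} \xi_{lj}^{(i)}\, g_l(z_{il}(t)),$$

where, for $l = 1, \ldots, p$, $g_l : Z_l \to \mathbb{R}$ is continuous. Notationally, when reference to the $i$-th
individual is self-explanatory, we shall denote the function $\phi_{i,\xi_j^{(i)}}$ by $\phi_{\xi_j}$.

(iii) For $l = 1, 2, \ldots, p$ and for $t \geq 0$,

$$\frac{1}{n}\sum_{i=1}^{n} g_l(z_{il}(t)) \to c_l(t) \tag{3.5}$$

and, for $l, m = 1, 2, \ldots, p$,

$$\frac{1}{n}\sum_{i=1}^{n} g_l(z_{il}(t))\, g_m(z_{im}(t)) \to c_l(t)\, c_m(t) \tag{3.6}$$

as $n \to \infty$, where $\{c_l(t) : t \geq 0\}$ are real constants for $l = 1, \ldots, p$.

(iv) For $l, m = 1, 2, \ldots, p$, and for $i = 1, \ldots, n$,

$$\lim_{T \to \infty} \frac{1}{T}\int_0^T g_l(z_{il}(s))\, ds = c_{il}^{(1)} \tag{3.7}$$

and

$$\lim_{T \to \infty} \frac{1}{T}\int_0^T g_l(z_{il}(s))\, g_m(z_{im}(s))\, ds = c_{ilm}^{(2)}, \tag{3.8}$$

where $c_{il}^{(1)}$ and $c_{ilm}^{(2)}$ are real constants.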


Remark 1 Observe that although (H4") is seemingly restrictive in the sense that the ratios
$\frac{b^2_{\beta_j}(s, x)}{\sigma^2(s, x)}$ and $\frac{b_{\beta_1}(s, x)\, b_{\beta_0}(s, x)}{\sigma^2(s, x)}$ are approximately independent of the underlying stochastic process, assumption (H5")
attempts to compensate for the restrictions by providing a rich structure to $\phi_{\xi_j}$ consisting of covariate
information varying continuously with time. Hence, assumption (H4") need not be viewed as restrictive.


Maitra and Bhattacharya (2016a) argue that (3.5) and (3.6) hold if one assumes that for $i = 1, \ldots, n$
and $l = 1, \ldots, p$, the covariates $z_{il}$ are observed realizations of stochastic processes that are iid for $i =
1, \ldots, n$, for all $l = 1, \ldots, p$, and that for $l \neq m$, the processes generating $z_{il}$ and $z_{im}$ are independent.
In other words, although we assume the covariates to be non-random, in essence, it may be assumed that
$g_l(z_{il}(t))$ and $g_m(z_{im}(t))$ are uncorrelated for $l \neq m$.

In order that (H5") (iv) holds, one needs to further assume that the relevant stochastic processes
converge to appropriate stationary distributions. For example, $z_{il}(t)$ may be realizations of Markov
processes which are irreducible (with respect to some appropriate measure), aperiodic, positive recurrent
and possess invariant distributions; see, for example, Kontoyiannis and Meyn (2003).
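
It follows from (H5") (iv) that

$$\lim_{T \to \infty} \frac{1}{T}\int_0^T \phi_{i,\xi_j^{(i)}}(z_i(s))\, ds = \xi_{0j}^{(i)} + \sum_{l=1}^{p} \xi_{lj}^{(i)}\, c_{il}^{(1)} = \bar\phi^{(1)}_{i,\xi_j^{(i)}} \ \text{(say)}, \tag{3.9}$$

and

$$\lim_{T \to \infty} \frac{1}{T}\int_0^T \phi^2_{i,\xi_j^{(i)}}(z_i(s))\, ds = \left(\xi_{0j}^{(i)}\right)^2 + 2\,\xi_{0j}^{(i)} \sum_{l=1}^{p} \xi_{lj}^{(i)}\, c_{il}^{(1)} + \sum_{l=1}^{p}\sum_{m=1}^{p} \xi_{lj}^{(i)}\, \xi_{mj}^{(i)}\, c_{ilm}^{(2)} = \bar\phi^{(2)}_{i,\xi_j^{(i)}} \ \text{(say)}, \tag{3.10}$$

$$\lim_{T \to \infty} \frac{1}{T}\int_0^T \phi_{i,\xi_0^{(i)}}(z_i(s))\, \phi_{i,\xi_1^{(i)}}(z_i(s))\, ds = \xi_{00}^{(i)}\xi_{01}^{(i)} + \xi_{00}^{(i)} \sum_{l=1}^{p} \xi_{l1}^{(i)}\, c_{il}^{(1)} + \xi_{01}^{(i)} \sum_{l=1}^{p} \xi_{l0}^{(i)}\, c_{il}^{(1)} + \sum_{l=1}^{p}\sum_{m=1}^{p} \xi_{l0}^{(i)}\, \xi_{m1}^{(i)}\, c_{ilm}^{(2)} = \bar\phi^{(2)}_{i,\xi_0^{(i)},\xi_1^{(i)}} \ \text{(say)}. \tag{3.11}$$

When $i$ is clear from the context, we shall often use the notations $\bar\phi^{(1)}_{\xi_j}$, $\bar\phi^{(2)}_{\xi_j}$ and $\bar\phi^{(2)}_{\xi_0,\xi_1}$.

Note that (3.9), (3.10) and (3.11) are limits of expectations with respect to the uniform distribution
on $[0, T]$. Hence, by the Cauchy-Schwarz inequality it follows that

$$\bar\phi^{(2)}_{\xi_0,\xi_1} \leq \sqrt{\bar\phi^{(2)}_{\xi_0}\, \bar\phi^{(2)}_{\xi_1}}. \tag{3.12}$$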
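
The time-average limits and the Cauchy-Schwarz bound can be illustrated numerically for given covariate-driven factors. In the sketch below the two functions `phi0` and `phi1`, the horizon and the midpoint-rule quadrature are hypothetical choices for illustration; the limits in the paper are defined through assumption (H5") (iv).

```python
import math


def time_average(f, T, n=20000):
    """Midpoint-rule approximation of (1/T) * integral of f over [0, T]."""
    h = T / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h / T


def phi0(s):
    return 1.0 + 0.5 * math.sin(s)


def phi1(s):
    return 0.8 + 0.3 * math.cos(s)


T = 200.0
m2_0 = time_average(lambda s: phi0(s) ** 2, T)       # approximates the limit in (3.10), j = 0
m2_1 = time_average(lambda s: phi1(s) ** 2, T)       # approximates the limit in (3.10), j = 1
m2_01 = time_average(lambda s: phi0(s) * phi1(s), T)  # approximates the limit in (3.11)

cauchy_schwarz_holds = m2_01 <= math.sqrt(m2_0 * m2_1)
```

For these periodic choices the long-run averages settle near $1 + 0.5^2/2$, $0.8^2 + 0.3^2/2$ and $0.8$ respectively, and the cross average stays below the geometric mean of the two squared averages.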


The following lemmas will be useful in our proceedings. The proofs of these lemmas are provided 
in sections S-1, S-2 and S-3 of the supplement. 

Lemma 2 The limits $\bar\phi^{(1)}_{\xi_j}$, $\bar\phi^{(2)}_{\xi_j}$ and $\bar\phi^{(2)}_{\xi_0,\xi_1}$ are continuous in $(\xi_0, \xi_1)$.


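Lemma 3 Assume (H1")-(H5"). Then, the following hold:

(i) $\frac{1}{T} E_{\theta_0}\left(V_{\theta_j,T}\right) \to \bar\phi^{(2)}_{\xi_j}\, \kappa_j(\beta_j)$; $j = 0, 1$. (3.13)

(ii) $\frac{1}{T} E_{\theta_0}\left(V_{\theta_0,\theta_1,T}\right) \to \bar\phi^{(2)}_{\xi_0,\xi_1}\, \kappa(\beta_0, \beta_1)$. (3.14)

(iii) $\frac{1}{T} E_{\theta_0}\left(U_{\theta_0,T}\right) \to \bar\phi^{(2)}_{\xi_0}\, \kappa_0(\beta_0)$. (3.15)

(iv) $\frac{1}{T} E_{\theta_0}\left(U_{\theta_1,T}\right) \to \bar\phi^{(2)}_{\xi_0,\xi_1}\, \kappa(\beta_0, \beta_1)$. (3.16)

(v) $\frac{V_{\theta_j,T}}{T} \stackrel{a.s.}{\longrightarrow} \bar\phi^{(2)}_{\xi_j}\, \kappa_j(\beta_j)$; $j = 0, 1$. (3.17)

(vi) $\frac{V_{\theta_0,\theta_1,T}}{T} \stackrel{a.s.}{\longrightarrow} \bar\phi^{(2)}_{\xi_0,\xi_1}\, \kappa(\beta_0, \beta_1)$. (3.18)

(vii) $\frac{1}{T}\int_0^T \frac{\phi_{\xi_j}(s)\, b_{\beta_j}(s, X(s))}{\sigma(s, X(s))}\, dW(s) \stackrel{a.s.}{\longrightarrow} 0$; $j = 0, 1$. (3.19)

(viii) $\frac{U_{\theta_0,T}}{T} \stackrel{a.s.}{\longrightarrow} \bar\phi^{(2)}_{\xi_0}\, \kappa_0(\beta_0)$. (3.20)

(ix) $\frac{U_{\theta_1,T}}{T} \stackrel{a.s.}{\longrightarrow} \bar\phi^{(2)}_{\xi_0,\xi_1}\, \kappa(\beta_0, \beta_1)$. (3.21)

In the above, "$\stackrel{a.s.}{\longrightarrow}$" denotes convergence "almost surely" as $T \to \infty$ with respect to $X$ (under $\theta_0$),
and the expectations are also with respect to $X$ (under $\theta_0$).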

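Lemma 4 Assume (H1") - (H5"). Then, the following holds:

$$\bar\phi^{(2)}_{\xi_0,\xi_1}\, \kappa(\beta_0, \beta_1) \leq \sqrt{\bar\phi^{(2)}_{\xi_0}\, \bar\phi^{(2)}_{\xi_1}}\, \sqrt{\kappa_0(\beta_0)\, \kappa_1(\beta_1)}. \tag{3.22}$$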

4 Convergence of Bayes factor with respect to time when two individual 
SDE’s are compared 


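From the system of SDE's defined by (2.1) and (2.2) we now consider the $i$-th individual only. To
avoid notational complexity we denote $X_i$ simply by $X$. Consequently, $\phi_{i,\xi_j^{(i)}}(t)$ and $T_i$ will be denoted
by $\phi_{\xi_j}(t)$ and $T$, respectively. In connection with the $i$-th individual we consider the following two
SDE's:

$$dX(t) = \phi_{\xi_0}(t)\, b_{\beta_0}(t, X(t))\, dt + \sigma(t, X(t))\, dW(t) \tag{4.1}$$

and

$$dX(t) = \phi_{\xi_1}(t)\, b_{\beta_1}(t, X(t))\, dt + \sigma(t, X(t))\, dW(t). \tag{4.2}$$

For any $t \in [0, T]$, for $j = 0, 1$, let

$$U_{\theta_j,t} = \int_0^t \frac{\phi_{\xi_j}(s)\, b_{\beta_j}(s, X(s))}{\sigma^2(s, X(s))}\, dX(s), \quad V_{\theta_j,t} = \int_0^t \frac{\phi^2_{\xi_j}(s)\, b^2_{\beta_j}(s, X(s))}{\sigma^2(s, X(s))}\, ds, \quad V_{\theta_0,\theta_j,t} = \int_0^t \frac{\phi_{\xi_j}(s)\, b_{\beta_j}(s, X(s))\, \phi_{\xi_0}(s)\, b_{\beta_0}(s, X(s))}{\sigma^2(s, X(s))}\, ds. \tag{4.3}$$

Note that $V_{\theta_0,t} = V_{\theta_0,\theta_0,t}$ and $V_{\theta_1,t} = V_{\theta_1,\theta_1,t}$. We also let

$$f_{\theta_j,t}(X_{0,t}) = \exp\left(U_{\theta_j,t} - \frac{V_{\theta_j,t}}{2}\right). \tag{4.4}$$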

Here we are interested in asymptotic properties of the Bayes factor, given by

$$I_T = \int_{\Theta} \frac{f_{\theta_1,T}(X_{0,T})}{f_{\theta_0,T}(X_{0,T})}\, \pi(d\theta_1), \tag{4.5}$$

as $T \to \infty$.

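For our purpose, let us define, for any $h > 0$,

$$U_{\theta_j,t,t+h} = \int_t^{t+h} \frac{\phi_{\xi_j}(s)\, b_{\beta_j}(s, X(s))}{\sigma^2(s, X(s))}\, dX(s), \quad V_{\theta_j,t,t+h} = \int_t^{t+h} \frac{\phi^2_{\xi_j}(s)\, b^2_{\beta_j}(s, X(s))}{\sigma^2(s, X(s))}\, ds, \quad V_{\theta_0,\theta_j,t,t+h} = \int_t^{t+h} \frac{\phi_{\xi_j}(s)\, b_{\beta_j}(s, X(s))\, \phi_{\xi_0}(s)\, b_{\beta_0}(s, X(s))}{\sigma^2(s, X(s))}\, ds.$$

Observe, as before, that $V_{\theta_0,t,t+h} = V_{\theta_0,\theta_0,t,t+h}$ and $V_{\theta_1,t,t+h} = V_{\theta_1,\theta_1,t,t+h}$. We let (4.6)

$$f_{\theta_j,t,t+h}(X_{t,t+h}) = \exp\left(U_{\theta_j,t,t+h} - \frac{V_{\theta_j,t,t+h}}{2}\right), \tag{4.7}$$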

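where, for any $0 \leq a < b < \infty$, $X_{a,b}$ denotes a path of the process $X$ from $a$ to $b$. For any $t \geq 0$ and
$h > 0$, we define

$$\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\theta_1,t,t+h}\right) = E_{\theta_0}\left[\log \frac{f_{\theta_0,t,t+h}(X_{t,t+h})}{f_{\theta_1,t,t+h}(X_{t,t+h})}\right], \tag{4.8}$$

where $E_{\theta_0} \equiv E_{f_{\theta_0,T}}$. Note that this is not the familiar Kullback-Leibler divergence measure, since $f_{\theta_0,T}$,
with respect to which the expectation is taken, is not the same as $f_{\theta_0,t,t+h}$. In fact, since in our case, for
$j = 0, 1$,

$$f_{\theta_j,t,t+h}(X_{t,t+h}) = \exp\left(U_{\theta_j,t,t+h} - \frac{V_{\theta_j,t,t+h}}{2}\right) = \exp\left(\left(U_{\theta_j,t+h} - U_{\theta_j,t}\right) - \frac{V_{\theta_j,t+h} - V_{\theta_j,t}}{2}\right), \tag{4.9}$$

it follows that

$$\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\theta_1,t,t+h}\right) = \mathcal{K}\left(f_{\theta_0,t+h}, f_{\theta_1,t+h}\right) - \mathcal{K}\left(f_{\theta_0,t}, f_{\theta_1,t}\right), \tag{4.10}$$

where $\mathcal{K}\left(f_{\theta_0,t+h}, f_{\theta_1,t+h}\right)$ and $\mathcal{K}\left(f_{\theta_0,t}, f_{\theta_1,t}\right)$ are proper Kullback-Leibler divergences between $f_{\theta_0,t+h}$,
$f_{\theta_1,t+h}$, and $f_{\theta_0,t}$, $f_{\theta_1,t}$, respectively. We now define

$$\mathcal{K}'_t\left(f_{\theta_0}, f_{\theta_1}\right) = \lim_{h \to 0} \frac{\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\theta_1,t,t+h}\right)}{h} = \frac{1}{2}\frac{d}{dt}E_{\theta_0}\left(V_{\theta_0,t}\right) - \frac{d}{dt}E_{\theta_0}\left(V_{\theta_0,\theta_1,t}\right) + \frac{1}{2}\frac{d}{dt}E_{\theta_0}\left(V_{\theta_1,t}\right). \tag{4.11}$$

The expression (4.11) easily follows using (4.9), the relation (2.8) and (2.10).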
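
The intermediate step behind (4.11) can be written out as follows. This is a sketch in the notation of this section: it only combines (2.8), (2.10) and (4.10), and takes for granted that the expectations are finite and differentiable in $t$.

```latex
% By (2.8) and (2.10): E_{\theta_0}(U_{\theta_j,t}) = E_{\theta_0}(V_{\theta_0,\theta_j,t}), j = 0, 1,
% and V_{\theta_0,\theta_0,t} = V_{\theta_0,t}.
\begin{align*}
\mathcal{K}\left(f_{\theta_0,t}, f_{\theta_1,t}\right)
  &= E_{\theta_0}\left[\left(U_{\theta_0,t} - \tfrac{1}{2}V_{\theta_0,t}\right)
     - \left(U_{\theta_1,t} - \tfrac{1}{2}V_{\theta_1,t}\right)\right] \\
  &= E_{\theta_0}\left(V_{\theta_0,t}\right) - \tfrac{1}{2}E_{\theta_0}\left(V_{\theta_0,t}\right)
     - E_{\theta_0}\left(V_{\theta_0,\theta_1,t}\right) + \tfrac{1}{2}E_{\theta_0}\left(V_{\theta_1,t}\right) \\
  &= \tfrac{1}{2}E_{\theta_0}\left(V_{\theta_0,t}\right) - E_{\theta_0}\left(V_{\theta_0,\theta_1,t}\right)
     + \tfrac{1}{2}E_{\theta_0}\left(V_{\theta_1,t}\right).
\end{align*}
% Dividing the difference in (4.10) by h and letting h -> 0 differentiates this
% expression with respect to t, which gives the right-hand side of (4.11).
```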


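4.1 Pseudo Kullback-Leibler ($\delta$) property

We make the following assumption:

(H6") For a fixed $\delta > 0$, the prior $\pi$ satisfies

$$\pi\left(\theta_1 \in \Theta : \inf_t \mathcal{K}'_t\left(f_{\theta_0}, f_{\theta_1}\right) \geq \delta\right) = 1. \tag{4.12}$$

Let us define

$$\mathcal{K}^{\infty}\left(f_{\theta_0}, f_{\theta_1}\right) = \lim_{T \to \infty} \frac{1}{T}\int_0^T \mathcal{K}'_t\left(f_{\theta_0}, f_{\theta_1}\right) dt. \tag{4.13}$$

We assume the following:

(H7") Given $\delta$ associated with (H6"), for any $c > 0$, the prior $\pi$ satisfies

$$\pi\left(\theta_1 \in \Theta : \delta \leq \mathcal{K}^{\infty}\left(f_{\theta_0}, f_{\theta_1}\right) \leq \delta + c\right) > 0. \tag{4.14}$$

We refer to property (H7") as the pseudo Kullback-Leibler ($\delta$) property of the prior $\pi$. Note that (4.11),
(3.13) and (3.14) imply

$$\mathcal{K}^{\infty}\left(f_{\theta_0}, f_{\theta_1}\right) = \frac{\bar\phi^{(2)}_{\xi_0}}{2}\,\kappa_0(\beta_0) - \bar\phi^{(2)}_{\xi_0,\xi_1}\,\kappa(\beta_0, \beta_1) + \frac{\bar\phi^{(2)}_{\xi_1}}{2}\,\kappa_1(\beta_1) \geq \frac{1}{2}\left(\sqrt{\bar\phi^{(2)}_{\xi_0}\,\kappa_0(\beta_0)} - \sqrt{\bar\phi^{(2)}_{\xi_1}\,\kappa_1(\beta_1)}\right)^2 \geq 0, \tag{4.15}$$

by Lemma 4. Provided that (4.12) holds and the prior $\pi$ is dominated by the Lebesgue measure, the
pseudo Kullback-Leibler ($\delta$) property holds because of continuity of (4.15) in $\theta_1 = (\beta_1, \xi_1)$ ensured by
Lemma 2.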
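
The lower bound in (4.15) is an algebraic consequence of the bound on the cross term, and it can be checked numerically for any admissible inputs. The numbers used below are hypothetical values chosen to satisfy that cross-term bound; the function names are illustrative.

```python
import math


def kl_rate(phi2_0, phi2_1, phi2_01, kappa0, kappa1, kappa01):
    """Limiting divergence rate: phi2_0*kappa0/2 - phi2_01*kappa01 + phi2_1*kappa1/2."""
    return 0.5 * phi2_0 * kappa0 - phi2_01 * kappa01 + 0.5 * phi2_1 * kappa1


def kl_rate_lower_bound(phi2_0, phi2_1, kappa0, kappa1):
    """Half the squared difference of sqrt(phi2_j * kappa_j), the bound in (4.15)."""
    return 0.5 * (math.sqrt(phi2_0 * kappa0) - math.sqrt(phi2_1 * kappa1)) ** 2


# Hypothetical inputs; the cross term satisfies
# phi2_01 * kappa01 <= sqrt(phi2_0 * phi2_1) * sqrt(kappa0 * kappa1).
phi2_0, phi2_1, phi2_01 = 1.125, 0.685, 0.8
kappa0, kappa1, kappa01 = 1.0, 4.0, 2.0

rate = kl_rate(phi2_0, phi2_1, phi2_01, kappa0, kappa1, kappa01)
bound = kl_rate_lower_bound(phi2_0, phi2_1, kappa0, kappa1)
```

The bound is attained exactly when the cross term equals the geometric-mean bound, so a strictly smaller cross term, as in this example, leaves a gap between `rate` and `bound`.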


4.2 Q* property 

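For $t > 0$, let $\mathcal{F}_t$ be the $\sigma$-algebra generated by $X(0)$ and the history of the process up to (and including)
time $t$, and let $\pi_t(\theta_1) = \pi(\theta_1 \mid \mathcal{F}_t)$ be the posterior of $\theta_1$ given $\mathcal{F}_t$. Also, let

$$\begin{aligned}
\hat f_{t-h,t}(X_{t-h,t}) &= \int_{\theta_1 \in \Theta} \exp\left(\int_{t-h}^{t} \frac{\phi_{\xi_1}(s)\, b_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, dX(s) - \frac{1}{2}\int_{t-h}^{t} \frac{\phi^2_{\xi_1}(s)\, b^2_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, ds\right) \pi_{t-h}(d\theta_1) \\
&= E\left[\exp\left(\int_{t-h}^{t} \frac{\phi_{\xi_1}(s)\, b_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, dX(s) - \frac{1}{2}\int_{t-h}^{t} \frac{\phi^2_{\xi_1}(s)\, b^2_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, ds\right) \Big|\ \mathcal{F}_{t-h}\right]
\end{aligned} \tag{4.16}$$

be the posterior predictive density. Further, for any Borel set $A$ such that $\pi(A) > 0$, let

$$\begin{aligned}
\hat f_{A}(X_{t-h,t}) &= \int_{\theta_1 \in A} \exp\left(\int_{t-h}^{t} \frac{\phi_{\xi_1}(s)\, b_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, dX(s) - \frac{1}{2}\int_{t-h}^{t} \frac{\phi^2_{\xi_1}(s)\, b^2_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, ds\right) \pi_{t-h,A}(d\theta_1) \\
&= E\left[\exp\left(\int_{t-h}^{t} \frac{\phi_{\xi_1}(s)\, b_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, dX(s) - \frac{1}{2}\int_{t-h}^{t} \frac{\phi^2_{\xi_1}(s)\, b^2_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, ds\right) \Big|\ \mathcal{F}_{t-h}, A\right],
\end{aligned} \tag{4.17}$$

where

$$\pi_{t,A}(d\theta_1) = \frac{I_A(\theta_1)\, \pi_t(d\theta_1)}{\int_A \pi_t(d\theta_1)}$$

is the posterior restricted to the set $A$. We assume the following:

(H8")

$$\liminf_{t \to \infty} E\left[\mathcal{K}'_t\left(f_{\theta_0}, \hat f_{A_t(\delta)}\right)\right] \geq \delta, \tag{4.18}$$

whenever

$$A_t(\delta) = \left\{\theta_1 \in \Theta : \mathcal{K}'_t\left(f_{\theta_0}, f_{\theta_1}\right) \geq \delta\right\}. \tag{4.19}$$

We refer to (H8") as the Q* property.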


4.3 Main result on convergence of Bayes factor when two individual SDE's are compared


Let $I_0 = 1$ and for $t > 0$, let us define, analogous to (4.5),

$$I_t = \int_{\Theta} \frac{f_{\theta_1,t}(X_{0,t})}{f_{\theta_0,t}(X_{0,t})}\, \pi(d\theta_1). \tag{4.20}$$

The following lemma, proved in section S-4 of the supplement, will prove useful in proving our main 
theorem on convergence of Bayes factor. 


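Lemma 5

$$E_{\theta_0}\left[\log \frac{I_t}{I_{t-h}}\ \Big|\ \mathcal{F}_{t-h}\right] = E_{\theta_0}\left[\log \frac{\hat f_{t-h,t}(X_{t-h,t})}{f_{\theta_0,t-h,t}(X_{t-h,t})}\ \Big|\ \mathcal{F}_{t-h}\right]. \tag{4.21}$$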


We make the following further assumption: 


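(H9") For any $t \geq 0$, $\frac{1}{h_n}\mathcal{K}\left(f_{\theta_0,t,t+h_n}, \hat f_{t,t+h_n}\right)$ converges in expectation for all sequences $\{h_n\}$ converging
to zero as $n \to \infty$, with limit independent of $\{h_n\}$. We refer to the limiting process as $\mathcal{K}'_t$. In
other words,

$$\lim_{n \to \infty} E\left[\frac{\mathcal{K}\left(f_{\theta_0,t,t+h_n}, \hat f_{t,t+h_n}\right)}{h_n}\right] = E\left(\mathcal{K}'_t\left(f_{\theta_0}, \hat f\right)\right), \tag{4.22}$$

for any sequence $\{h_n\}$ such that $h_n \to 0$ as $n \to \infty$.

Because of Lemma 5 it follows from (H9"), using uniform integrability (which is easily seen to hold
because of (H1") - (H4") and (3.2)), that $J_{h_n}(t) = \frac{\log I_{t+h_n} - \log I_t}{h_n}$ converges in expectation for all
sequences $\{h_n\}$ converging to zero as $n \to \infty$, with limit independent of $\{h_n\}$. We refer to the limiting
process as $J'_t$. That is, for any $t \geq 0$,

$$\lim_{n \to \infty} E\left(J_{h_n}(t)\right) = E\left(J'_t\right) = \frac{d}{dt}E\left[\log I_t\right].$$

Now,

$$\hat f_{t,t+h}(X_{t,t+h}) = E\left[\exp\left(\int_t^{t+h} \frac{\phi_{\xi_1}(s)\, b_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, dX(s) - \frac{1}{2}\int_t^{t+h} \frac{\phi^2_{\xi_1}(s)\, b^2_{\beta_1}(s, X(s))}{\sigma^2(s, X(s))}\, ds\right) \Big|\ \mathcal{F}_t\right], \tag{4.23}$$

so that Lemma 5 implies

$$E\left[\log \frac{I_{t+h}}{I_t}\ \Big|\ \mathcal{F}_t\right] = E\left[\log \frac{\hat f_{t,t+h}(X_{t,t+h})}{f_{\theta_0,t,t+h}(X_{t,t+h})}\ \Big|\ \mathcal{F}_t\right] \tag{4.24}$$

$$= -\mathcal{K}\left(f_{\theta_0,t,t+h}, \hat f_{t,t+h}\right). \tag{4.25}$$

It follows, using (H9"), that

$$E\left(J'_t \mid \mathcal{F}_t\right) = -\mathcal{K}'_t\left(f_{\theta_0}, \hat f\right). \tag{4.26}$$

Note that for all sequences $\{h_n\}$ such that $h_n \to 0$ as $n \to \infty$, $J_{h_n}(t) = \frac{\log I_{t+h_n} - \log I_t}{h_n}$ is measurable
with respect to $\mathcal{F}_{t^*} = \sigma\left(X(s) : 0 \leq s \leq t^*\right)$, for all $t^* > t \geq 0$. Hence, $E\left(J'_t \mid \mathcal{F}_t\right) \neq J'_t$.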

Regarding convergence of $I_t$, we are now ready to present our main theorem, whose proof is
provided in section S-5 of the supplement.

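Theorem 6 Assume the SDE set-up and conditions (H1") - (H9"). Then

$$\frac{1}{T}E_{\theta_0}\left(\log I_T\right) \to -\delta, \tag{4.27}$$

but

$$\frac{1}{T^2}Var_{\theta_0}\left(\log I_T\right) = O(1), \tag{4.28}$$

as $T \to \infty$.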

Corollary 7 For $j = 1, 2$, let $R_{jT}(\theta_j) = \frac{f_{\theta_j,T}(X_{0,T})}{f_{\theta_0,T}(X_{0,T})}$, where $\theta_1$ and $\theta_2$ are two different finite sets of
parameters, perhaps with different dimensionalities, associated with the two models to be compared.
For $j = 1, 2$, let

$$I_{jT} = \int R_{jT}(\theta_j)\, \pi_j(d\theta_j),$$

where $\pi_j$ is the prior on $\theta_j$. Let $B_T = I_{1T}/I_{2T}$ denote the Bayes factor for comparing the two models
associated with $\pi_1$ and $\pi_2$. Assume that both the models satisfy (H1") - (H9"), and have the pseudo
Kullback-Leibler property with $\delta = \delta_1$ and $\delta = \delta_2$ respectively. Then

$$\frac{1}{T}E_{\theta_0}\left(\log B_T\right) \to \delta_2 - \delta_1, \tag{4.29}$$

as $T \to \infty$.


5 Illustration of our asymptotic result for comparing two individual SDE's with a special case


Let the parameter space $\Theta$ be compact, so that (H1") holds. Let $b_{\beta_j}$ and $\sigma$ satisfy (H2") such that

$$\frac{b_{\beta_j}(s, x)}{\sigma(s, x)} = \eta_j(\beta_j); \quad j = 0, 1, \tag{5.1}$$

so that

$$\frac{b_{\beta_0}(s, x)\, b_{\beta_1}(s, x)}{\sigma^2(s, x)} = \eta_0(\beta_0)\, \eta_1(\beta_1). \tag{5.2}$$

In the above, $\eta_1(\beta_1)$ is continuous in $\beta_1$. Hence, (H3") and (H4") are satisfied. We assume that the
relevant covariates and the functions $g_l$ are such that (H5") holds.
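
Letting $\kappa_j(\beta_j) = \eta_j^2(\beta_j)$ and $\kappa(\beta_0, \beta_1) = \eta_0(\beta_0)\,\eta_1(\beta_1)$, equations (5.1) - (5.2) entail

$$V_{\theta_j,t} = \kappa_j(\beta_j)\int_0^t \phi^2_{\xi_j}(s)\, ds; \tag{5.3}$$

$$V_{\theta_0,\theta_1,t} = \kappa(\beta_0, \beta_1)\int_0^t \phi_{\xi_0}(s)\, \phi_{\xi_1}(s)\, ds; \tag{5.4}$$

$$U_{\theta_0,t} = \kappa_0(\beta_0)\int_0^t \phi^2_{\xi_0}(s)\, ds + \eta_0(\beta_0)\int_0^t \phi_{\xi_0}(s)\, dW(s); \tag{5.5}$$

$$U_{\theta_0,\theta_1,t} = \kappa(\beta_0, \beta_1)\int_0^t \phi_{\xi_0}(s)\, \phi_{\xi_1}(s)\, ds + \eta_1(\beta_1)\int_0^t \phi_{\xi_1}(s)\, dW(s); \tag{5.6}$$

$$V_{\theta_j,t,t+h} = \kappa_j(\beta_j)\int_t^{t+h} \phi^2_{\xi_j}(s)\, ds; \tag{5.7}$$

$$V_{\theta_0,\theta_1,t,t+h} = \kappa(\beta_0, \beta_1)\int_t^{t+h} \phi_{\xi_0}(s)\, \phi_{\xi_1}(s)\, ds; \tag{5.8}$$

$$U_{\theta_0,t,t+h} = \kappa_0(\beta_0)\int_t^{t+h} \phi^2_{\xi_0}(s)\, ds + \eta_0(\beta_0)\int_t^{t+h} \phi_{\xi_0}(s)\, dW(s); \tag{5.9}$$

$$U_{\theta_0,\theta_1,t,t+h} = \kappa(\beta_0, \beta_1)\int_t^{t+h} \phi_{\xi_0}(s)\, \phi_{\xi_1}(s)\, ds + \eta_1(\beta_1)\int_t^{t+h} \phi_{\xi_1}(s)\, dW(s). \tag{5.10}$$

Due to (5.3) and (5.4), we obtain

$$\begin{aligned}
\mathcal{K}'_t\left(f_{\theta_0}, f_{\theta_1}\right) &= \lim_{h \to 0} \frac{\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\theta_1,t,t+h}\right)}{h} \\
&= \frac{1}{2}\frac{d}{dt}E_{\theta_0}\left(V_{\theta_0,t}\right) - \frac{d}{dt}E_{\theta_0}\left(V_{\theta_0,\theta_1,t}\right) + \frac{1}{2}\frac{d}{dt}E_{\theta_0}\left(V_{\theta_1,t}\right) \\
&= \frac{\phi^2_{\xi_0}(t)}{2}\,\kappa_0(\beta_0) - \phi_{\xi_0}(t)\,\phi_{\xi_1}(t)\,\kappa(\beta_0, \beta_1) + \frac{\phi^2_{\xi_1}(t)}{2}\,\kappa_1(\beta_1) \\
&= \frac{1}{2}\left(\phi_{\xi_0}(t)\,\eta_0(\beta_0) - \phi_{\xi_1}(t)\,\eta_1(\beta_1)\right)^2.
\end{aligned} \tag{5.11}$$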


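Note that

$$\inf_t \mathcal{K}'_t\left(f_{\theta_0}, f_{\theta_1}\right) \geq \frac{1}{2}\inf_{\theta_1 \in \Theta,\ t \geq 0}\left(\phi_{\xi_0}(t)\,\eta_0(\beta_0) - \phi_{\xi_1}(t)\,\eta_1(\beta_1)\right)^2 = \delta. \tag{5.12}$$

It follows from (5.12) that (H6") holds for any prior on $\theta_1$.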
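
To see the first-order behaviour of the log-Bayes factor in the simplest instance of this special case, suppose in addition that the covariate factors are constant in time, so that $a_j = \phi_{\xi_j}\eta_j(\beta_j)$ does not depend on $t$, and that the prior on $\theta_1$ is supported on finitely many values. Both restrictions are illustrative assumptions made here and not in the paper. Under the true model, the log-likelihood ratio over $[0, T]$ then reduces to $-(a_0 - a_1)^2 T/2 + (a_1 - a_0) W_T$, and the Bayes factor is a finite weighted sum of such terms.

```python
import math


def log_bayes_factor_constant_case(a0, a1_support, weights, T, w_T):
    """log I_T for a discrete prior on a1 when a_j = phi_j * eta_j is constant in time.

    w_T is the value of the driving Brownian motion at time T under the true model.
    """
    terms = [
        math.log(w) - 0.5 * (a0 - a1) ** 2 * T + (a1 - a0) * w_T
        for a1, w in zip(a1_support, weights)
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(v - m) for v in terms))


def delta_constant_case(a0, a1_support):
    """Smallest divergence rate (a0 - a1)^2 / 2 over the prior support."""
    return min(0.5 * (a0 - a1) ** 2 for a1 in a1_support)


a0 = 1.0
support = [0.0, 2.0, 3.0]
weights = [0.25, 0.25, 0.5]
T = 1000.0
normalized = log_bayes_factor_constant_case(a0, support, weights, T, w_T=0.0) / T
delta = delta_constant_case(a0, support)
```

With the Brownian term set to zero, `normalized` differs from `-delta` only through the logarithm of the prior mass on the closest support points divided by $T$. For a typical path $W_T$ is of order $\sqrt{T}$, so its contribution to `normalized` is of order $T^{-1/2}$.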
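Also, it follows directly from (5.11) that

$$\mathcal{K}^{\infty}\left(f_{\theta_0}, f_{\theta_1}\right) = \frac{\bar\phi^{(2)}_{\xi_0}}{2}\,\kappa_0(\beta_0) - \bar\phi^{(2)}_{\xi_0,\xi_1}\,\kappa(\beta_0, \beta_1) + \frac{\bar\phi^{(2)}_{\xi_1}}{2}\,\kappa_1(\beta_1), \tag{5.13}$$

which is continuous in $(\beta_1, \xi_1)$, due to the continuity assumption of $\eta_1(\beta_1)$ in $\beta_1$ and Lemma 2, which
guarantees continuity of $\bar\phi^{(2)}_{\xi_1}$ and $\bar\phi^{(2)}_{\xi_0,\xi_1}$ in $\xi_1$. Since the right-most side of (5.13) is a continuous function
of $\theta_1$, it follows that (H7") is clearly satisfied if the prior $\pi$ is dominated by the Lebesgue measure.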
We now verify the Q* property (H8")- Recall that fAt{S) = ft,t+h, since vr {At{ 6 )) = 1. Since 


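
$$f_{t,t+h}(X_{t,t+h}) \leq \sup_{\theta_1\in A_t(\delta)} f_{\theta_1,t,t+h}(X_{t,t+h}) = f_{\hat\theta_1(X_{t,t+h}),t,t+h}(X_{t,t+h}), \tag{5.14}$$

where $\hat\theta_1(X_{t,t+h}) \in A_t(\delta)$ is the maximizer of $f_{\theta_1,t,t+h}$ in the compact set $A_t(\delta)$. Hence,

\begin{align}
\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{A_t(\delta)}\right)
&= E_{\theta_0}\left(\log f_{\theta_0,t,t+h}(X_{t,t+h})\right) - E_{\theta_0}\left(\log f_{A_t(\delta)}(X_{t,t+h})\right) \notag\\
&\geq E_{\theta_0}\left(\log f_{\theta_0,t,t+h}(X_{t,t+h})\right) - E_{\theta_0}\left(\log f_{\hat\theta_1(X_{t,t+h}),t,t+h}(X_{t,t+h})\right) \notag\\
&= E_{\theta_0}\left(\log \frac{f_{\theta_0,t,t+h}(X_{t,t+h})}{f_{\hat\theta_1(X_{t,t+h}),t,t+h}(X_{t,t+h})}\right) \notag\\
&= E_{\hat\theta_1(X_{t,t+h})|\theta_0} E_{X_{t,t+h}|\hat\theta_1(X_{t,t+h})=\tau,\theta_0}\left(\log \frac{f_{\theta_0,t,t+h}(X_{t,t+h})}{f_{\{\hat\theta_1(X_{t,t+h})=\tau\},t,t+h}(X_{t,t+h})}\right) \notag\\
&= E_{\hat\theta_1(X_{t,t+h})|\theta_0}\,\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\tau,t,t+h}\right) \notag\\
&\geq E_{\hat\theta_1(X_{t,t+h})|\theta_0}\inf_{\tau\in A_t(\delta)}\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\tau,t,t+h}\right) \notag\\
&= E_{\hat\theta_1(X_{t,t+h})|\theta_0}\,\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\tau^*,t,t+h}\right) \notag\\
&= \mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\tau^*,t,t+h}\right) \notag\\
&\geq \delta, \tag{5.15}
\end{align}

where $\tau^* = \arg\min_{\tau\in A_t(\delta)} \mathcal{K}\left(f_{\theta_0,t,t+h}, f_{\tau,t,t+h}\right) \in A_t(\delta)$. Hence, the Q* property is satisfied.
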


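To see that (H9") holds, first observe that it follows from the proof of Lemma 5 that $\frac{I_{t+h}}{I_t} = \frac{f_{t,t+h}}{f_{\theta_0,t,t+h}}$,
which implies

$$\mathcal{K}\left(f_{\theta_0,t,t+h}, f_{t,t+h}\right) = E_{\theta_0}\left(\log f_{\theta_0,t,t+h}(X_{t,t+h}) - \log f_{t,t+h}(X_{t,t+h}) \,\middle|\, X_t\right). \tag{5.16}$$

Now,

$$E_{\theta_0}\left(\log f_{\theta_0,t,t+h}(X_{t,t+h}) \,\middle|\, X_t\right) = \frac{\kappa_0(\beta_0)}{2}\int_t^{t+h}\phi^2_{\xi_0}(s)\,ds = \frac{\kappa_0(\beta_0)}{2}\,h\,\phi^2_{\xi_0}(s^*(h)), \tag{5.17}$$

by the mean value theorem for integrals, where $s^*(h) \rightarrow t$, as $h \rightarrow 0$. Hence, using continuity of
$\phi_{\xi_0}(t)$ in $t$, we obtain

$$\lim_{h\rightarrow 0}\frac{1}{h}E_{\theta_0}\left(\log f_{\theta_0,t,t+h}(X_{t,t+h}) \,\middle|\, X_t\right) = \frac{\kappa_0(\beta_0)}{2}\lim_{h\rightarrow 0}\phi^2_{\xi_0}(s^*(h)) = \frac{\kappa_0(\beta_0)}{2}\phi^2_{\xi_0}(t). \tag{5.18}$$
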


To deal with (log ft,t+h{Xt,t+h)\Et^ , note that for any Xt,t+h, by the mean value theorem for 

integrals, 

ft,t+h{Xt,t+h) = f0{Xo,t,Xt,t+h),t,t+h(^I^I+>^'l^ 

where 0(Xo,t, Xt^t+h) G 0. It is clear that 0{Xo^t, Xt^t+h) 0{Xo,t, Xt) = 0(7fo,t) almost surely, as 
/i —0. Hence, 


E0q (log ft,t+h{Xt,t+h)\Et 


- ^e{Xo,t,Xt.t+h)\eo^Xt,t+h\Hxo,t,Xt,t+h)=a,eo 


f{HXo^t,Xt,t+h)=ct},t,t+h^^t>t+h)\Xt) , 
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where 


00 ft^t+h{Xt,t+h)\J^t^ 


E, 


= Eoq 


0 j [0(^xo,uXt,t+h)}u+h 


{Xt,t+h)\Et) 


ct-\-h 


= £^00 p(/3o>/3i(^o,i,^M+/j)) / 




rt+h 


i(x.,,x,,„)(»)* 




2 *^41(^0,t A,t+^)' 


(5.19) 


where si(/i), S 2 (/i) G [t, i + /i], associated with the mean value theorem for integrals. Hence, si{h) t 
and S 2 {h) —)■ t, almost surely as /i —)■ 0. 

Continuity of it(-), ki(-) and the results 6{XQ^t,Xt^t+h) —^ si{h) —^ t, S2{h) —> t, almost 

surely, as /i —)■ 0, in conjunction with the dominated convergence theorem exploiting boundedness of 
the functions (j)^ , R and Kj, imply, using continuity of cp^ {t) in t, that 


lim i Ee^ [log ft^t+h{Xt,t+h)\Et 

h^O li ' 


= ho ^^^\{Xo,t) HXo,t)) 






In other words, the limit of (5.16) exists and is unique as $h \rightarrow 0$. Now, equations (5.17) and (5.19) along
with the dominated convergence theorem imply that (H9") holds.

Thus, all the assumptions required for Theorem 6 and Corollary 7 are satisfied. Hence, both (4.27)
and (4.29) hold.


6 Asymptotic convergence of Bayes factor in the SDE set-up with respect 
to number of individuals and time 

6.1 Convergence of Bayes factor in the iid set-up 

Although Theorem 6 fails to ensure consistency of the Bayes factor as $T \rightarrow \infty$ in the sense that the
relevant variance is asymptotically positive, the theorem is useful to prove almost sure consistency when
$T \rightarrow \infty$ as well as $n \rightarrow \infty$, for both iid and non-iid situations. Theorem 8 formalizes this for the
iid set-up, while Theorem 12 establishes almost sure consistency of the Bayes factor in the non-iid
situation. Proofs of these theorems are contained in Sections S-6 and S-9 respectively in the supplement.

Theorem 8 Assume the iid set-up; also assume that conditions (H1") - (H9") hold for each SDE in
the systems (2.1) and (2.2). Then

$$\frac{1}{nT}\log I_{n,T} \rightarrow -\delta, \tag{6.1}$$

almost surely, as $n \rightarrow \infty$ and $T \rightarrow \infty$.
The following corollary is obvious. 


Corollary 9 For $j = 1,2$, and $i = 1,\ldots,n$, let $R^{(i)}_{jT}(\theta^{(i)}_j) = \frac{f_{\theta^{(i)}_j,T}(X_{i,0,T})}{f_{\theta_0,T}(X_{i,0,T})}$, where, for each $i$, $\theta^{(i)}_1$
and $\theta^{(i)}_2$ are two different finite sets of parameters, perhaps with different dimensionalities, associated
with the two systems (2.1) and (2.2) to be compared. For $j = 1,2$, let

$$I_{j,n,T} = \prod_{i=1}^{n}\int R^{(i)}_{jT}(\theta^{(i)}_j)\,\pi_j(d\theta^{(i)}_j),$$

where $\pi_j$ is the prior on $\theta^{(i)}_j$ for $i = 1,2,\ldots$. Let $B_{n,T} = I_{1,n,T}/I_{2,n,T}$ denote the Bayes factor for
comparing the two models associated with $\pi_1$ and $\pi_2$. Assume the iid case and suppose that both the
systems satisfy (H1") - (H9"), and have the pseudo Kullback-Leibler property with $\delta = \delta_1$ and $\delta = \delta_2$
respectively. Then

$$\frac{1}{nT}\log B_{n,T} \rightarrow \delta_2 - \delta_1,$$

almost surely, as $n \rightarrow \infty$ and $T \rightarrow \infty$.


6.2 Convergence of Bayes factor in the non-iid set-up 


We now relax the assumptions x® = x and = ^ 3 ^ = • • • 

we are now in a non-iid situation where the pro cesse s i = 1 ,. 

identically distributed. As mentioned in Section 2^ we assume that 6 
2 : G 3 = {z{t) G Z : t £ [0, 00 )}, it holds, due to Theorem|^ that 


= = 0 for j = 0,1. Thus, 

.., n, are independently, but not 
(*) jjj set-up, for each 


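
$$\frac{1}{T}E_{\theta_0}\left(\log I_{x,T,z}\right) \rightarrow -\delta(x,z), \tag{6.2}$$

as $T \rightarrow \infty$, where $\delta(x,z)$ depends upon the initial value $x \in \mathfrak{X}$ and the set of time-dependent covariates
$z \in \mathfrak{Z}$. The following lemma shows that $\delta(x,z)$ is continuous in $(x,z) \in \mathfrak{X} \times \mathfrak{Z}$.

Lemma 10 Assume the conditions of Theorem 6. Then, $\delta(x,z)$ is continuous in $(x,z) \in \mathfrak{X} \times \mathfrak{Z}$.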

Now consider the following limit:

$$\delta^{\infty} = \lim_{n\rightarrow\infty}\frac{1}{n}\sum_{i=1}^{n}\delta(x^i, z_i). \tag{6.3}$$

The following lemma shows that the above limit exists for all sequences $\{(x^i, z_i)\}_{i=1}^{\infty} \subset \mathfrak{X} \times \mathfrak{Z}$.

Lemma 11 The limit (6.3) exists for all sequences $\{(x^i, z_i)\}_{i=1}^{\infty} \subset \mathfrak{X} \times \mathfrak{Z}$.

Proofs of these two lemmas are provided in Sections S-7 and S-8 respectively in the supplement. Now,
we have the following theorem.


Theorem 12 Assume the non-iid set-up, and conditions (H1") - (H9"), for each SDE in the systems
(2.1) and (2.2). Then

$$\frac{1}{nT}\log I_{n,T} \rightarrow -\delta^{\infty}, \tag{6.4}$$

almost surely, as $T \rightarrow \infty$ and $n \rightarrow \infty$.

We then have the following corollary for the non-iid case.

Corollary 13 For $j = 1,2$, and $i = 1,\ldots,n$, let $R^{(i)}_{jT}(\theta^{(i)}_j) = \frac{f_{\theta^{(i)}_j,T}(X_{i,0,T})}{f_{\theta^{(i)}_0,T}(X_{i,0,T})}$, where, for each $i$, $\theta^{(i)}_1$
and $\theta^{(i)}_2$ are two different finite sets of parameters, perhaps with different dimensionalities, associated
with the two systems (2.1) and (2.2) to be compared. For $j = 1,2$, let

$$I_{j,n,T} = \prod_{i=1}^{n}\int R^{(i)}_{jT}(\theta^{(i)}_j)\,\pi_j(d\theta^{(i)}_j),$$

where $\pi_j$ is the prior on $\theta^{(i)}_j$; $i = 1,2,\ldots$. Let $B_{n,T} = I_{1,n,T}/I_{2,n,T}$ denote the Bayes factor for
comparing the two models associated with $\pi_1$ and $\pi_2$. Assume the non-iid case and suppose that both
the systems satisfy (H1") - (H9"), and have the pseudo Kullback-Leibler property with $\delta_i = \delta_{1i}$ and
$\delta_i = \delta_{2i}$ respectively. Let, for $j = 1,2$,

$$\delta_j^{\infty} = \lim_{n\rightarrow\infty}\frac{1}{n}\sum_{i=1}^{n}\delta_{ji}.$$

Then

$$\frac{1}{nT}\log B_{n,T} \rightarrow \delta_2^{\infty} - \delta_1^{\infty},$$

almost surely, as $n \rightarrow \infty$ and $T \rightarrow \infty$.


7 Simulation studies 


7.1 Covariate selection when n = 1, T = 5 

We first demonstrate with a simulation study the finite sample analogue of Bayes factor analysis associated
with a single individual, when $T \rightarrow \infty$. In this regard, we consider modeling a single individual by

$$dX(t) = (\xi_1 + \xi_2 z_1(t) + \xi_3 z_2(t) + \xi_4 z_3(t))(\xi_5 + \xi_6 X(t))dt + \sigma dW(t), \tag{7.1}$$

where we fix our diffusion coefficient as $\sigma = 20$. We consider the initial value $X(0) = 0$ and the time
interval $[0, T]$ with $T = 5$.

To achieve numerical stability of the marginal likelihood corresponding to data we choose the true
values of $\xi_i$; $i = 1,\ldots,6$ as follows: $\xi_i \sim N(\mu_i, 0.001^2)$, where $\mu_i \sim N(0, 1)$. This is not to be
interpreted as the prior; this is just a means to set the true values of the parameters of the data-generating
model.

We assume that the time dependent covariates $z_i(t)$ satisfy the following SDEs

\begin{align}
dz_1(t) &= (\theta_1 + \theta_2 z_1(t))dt + dW_1(t) \notag\\
dz_2(t) &= \theta_3\,dt + dW_2(t) \notag\\
dz_3(t) &= \theta_4 z_3(t)\,dt + dW_3(t), \tag{7.2}
\end{align}

where $W_i(\cdot)$; $i = 1,2,3$, are independent Wiener processes, and $\theta_i \stackrel{iid}{\sim} N(0, 0.01^2)$ for $i = 1,\ldots,4$.

We obtain the covariates by first simulating $\theta_i \stackrel{iid}{\sim} N(0, 0.01^2)$ for $i = 1,\ldots,4$, fixing the values,
and then by simulating the covariates using the SDEs (7.2) by discretizing the time interval $[0, 5]$ into
500 equispaced time points. In all our applications we have standardized the covariates over time so that
they have zero means and unit variances.

Once the covariates are thus obtained, we assume that the data are generated from the (true) model
where all the covariates are present. For the true values of the parameters, we simulated $(\xi_1,\ldots,\xi_6)$
from the prior and treated the obtained values as the true set of parameters $\theta_0$. We then generated the
data using (7.1) by discretizing the time interval $[0, 5]$ into 500 equispaced time points.
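
The data-generating steps above can be sketched as follows. This is a minimal Euler-Maruyama illustration, not the authors' code: the discretization scheme, the zero starting values of the covariates, the use of 500 steps (hence 501 grid points) and the seed are all assumptions made here for concreteness.

```python
import math
import random


def standardize(path):
    """Centre and scale a path to zero mean and unit variance over time."""
    m = sum(path) / len(path)
    sd = math.sqrt(sum((v - m) ** 2 for v in path) / len(path))
    return [(v - m) / sd for v in path]


def simulate_covariates(theta, T=5.0, steps=500, rng=random):
    """Euler-Maruyama paths of z1, z2, z3 from (7.2), started at 0 (assumed)."""
    dt = T / steps
    z1, z2, z3 = [0.0], [0.0], [0.0]
    for _ in range(steps):
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(3)]
        z1.append(z1[-1] + (theta[0] + theta[1] * z1[-1]) * dt + dw[0])
        z2.append(z2[-1] + theta[2] * dt + dw[1])
        z3.append(z3[-1] + theta[3] * z3[-1] * dt + dw[2])
    return [standardize(z) for z in (z1, z2, z3)]


def simulate_data(xi, z, sigma=20.0, x0=0.0, T=5.0, steps=500, rng=random):
    """Euler-Maruyama path of X from (7.1), given the covariate paths z."""
    dt = T / steps
    x = [x0]
    for k in range(steps):
        phi = xi[0] + xi[1] * z[0][k] + xi[2] * z[1][k] + xi[3] * z[2][k]
        drift = phi * (xi[4] + xi[5] * x[-1])
        x.append(x[-1] + drift * dt + sigma * rng.gauss(0.0, math.sqrt(dt)))
    return x


rng = random.Random(0)  # arbitrary seed, for reproducibility only
theta = [rng.gauss(0.0, 0.01) for _ in range(4)]
# xi_i ~ N(mu_i, 0.001^2) with mu_i ~ N(0, 1), as described in the text
xi_true = [rng.gauss(rng.gauss(0.0, 1.0), 0.001) for _ in range(6)]
z = simulate_covariates(theta, rng=rng)
x = simulate_data(xi_true, z, rng=rng)
```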

As we have three covariates, we will have $2^3 = 8$ different models. Denoting a model by the
presence and absence of the respective covariates, it then is the case that (1,1,1) is the true, data-
generating model, while (0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), and (1,1,0) are the
other 7 possible models.
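
As a small illustration of this bookkeeping, the eight models can be enumerated as indicator triples. The helper `restrict` is a hypothetical name introduced here; it encodes the assumption that excluding a covariate amounts to setting its coefficient in (7.1) to zero.

```python
from itertools import product

# Each model is a tuple of 0/1 indicators for the covariates (z1, z2, z3).
models = list(product((0, 1), repeat=3))
true_model = (1, 1, 1)
competitors = [m for m in models if m != true_model]


def restrict(xi, model):
    """Zero out the coefficients xi_2, xi_3, xi_4 of excluded covariates."""
    xi = list(xi)
    for k, keep in enumerate(model):
        if not keep:
            xi[k + 1] = 0.0
    return xi
```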

As per our theory, for a single individual, the Bayes factor is not consistent for increasing time
domain. However, we have shown that

$$\frac{1}{T}E_{\theta_0}\left(\log I_T\right) \rightarrow -\delta$$

as $T \rightarrow \infty$. Thus, the Bayes factor is consistent with respect to the expectation. Our simulation results
show that this holds even for the time domain $[0, 5]$, where we approximate the expectation with the
average of 1000 realizations of $\frac{1}{T}\log I_T$ associated with as many simulated data sets.


7.1.1 Case 1: the true parameter set $\theta_0$ is fixed
Prior on $\theta$

We first obtain the maximum likelihood estimator (MLE) of $\theta$ using simulated annealing and then
consider a normal prior with the MLE as the mean and variance O.S^Ie, where Ig is the identity matrix 
of order 6. 

Form of the Bayes factor

In this case the related Bayes factor has the form

$$I_T = \int \frac{f_{\theta_1,T}(X_{0,T})}{f_{\theta_0,T}(X_{0,T})}\,\pi(d\theta_1), \tag{7.3}$$

where $\theta_0 = (\xi_{0,1}, \xi_{0,2}, \xi_{0,3}, \xi_{0,4}, \xi_{0,5}, \xi_{0,6})$ is the true parameter set and $\theta_1 = (\xi_1, \xi_2, \xi_3, \xi_4, \xi_5, \xi_6)$ is
the unknown set of parameters corresponding to any other model. Table 7.1 describes the results of our
Bayes factor analyses. All 7 values in the table are negative, so the correct model (1,1,1) is always
preferred.

Table 7.1: Bayes factor results

Model      Averaged $\frac{1}{5}\log I_5$
(0,0,0)    -2.5756029
(0,0,1)    -0.913546
(0,1,1)    -0.5454860
(0,1,0)    -0.763952
(1,0,0)    -2.5774163
(1,0,1)    -0.9312218
(1,1,0)    -0.7628154
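
One way to approximate a quantity of the form (7.3) from a discretized path is plain Monte Carlo over prior draws, using the discretized Girsanov log-density $U_\theta - V_\theta/2$. The sketch below is an assumption-laden illustration rather than the authors' procedure: the function names, the Riemann/Ito sum discretization, the number of prior draws and the normal prior parametrization are all choices made here.

```python
import math
import random


def log_density(xi, x, z, sigma, dt):
    """Discretized Girsanov log-density U - V/2 for drift phi(t)*(xi5 + xi6*x)."""
    u = v = 0.0
    for k in range(len(x) - 1):
        phi = xi[0] + xi[1] * z[0][k] + xi[2] * z[1][k] + xi[3] * z[2][k]
        drift = phi * (xi[4] + xi[5] * x[k])
        u += drift * (x[k + 1] - x[k]) / sigma ** 2
        v += drift ** 2 * dt / sigma ** 2
    return u - 0.5 * v


def log_bayes_factor(x, z, xi0, prior_mean, prior_sd, model, sigma, dt,
                     draws=200, rng=random):
    """Monte Carlo estimate of log I_T = log E_prior[f_theta1 / f_theta0]."""
    base = log_density(xi0, x, z, sigma, dt)
    logs = []
    for _ in range(draws):
        xi1 = [rng.gauss(m, prior_sd) for m in prior_mean]
        for k, keep in enumerate(model):  # drop excluded covariates
            if not keep:
                xi1[k + 1] = 0.0
        logs.append(log_density(xi1, x, z, sigma, dt) - base)
    top = max(logs)  # log-mean-exp for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in logs) / draws)
```

Dividing the returned value by $T$ and averaging over independently simulated data sets gives the kind of averaged quantity reported in Table 7.1; the numbers produced by this sketch are not expected to reproduce the table.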


7.1.2 Case 2: the parameter set $\theta_0$ is random and has the prior distribution $\pi$

As before, we consider the same form of the prior as in Section 7.1.1 but with variance $0.1^2 I_6$. In this
case we calculate the marginal likelihood of the 8 possible models, and approximate

$$\frac{1}{5}E_{\theta_0}\left[\log\int f_{i,\theta_i,5}(X_{0,5})\,\pi(d\theta_i)\right]$$

for $i = 1,\ldots,8$ by averaging over 1000 replications of the data obtained from the true model. Denoting
its values by $\ell_i$, Table 7.2 shows that $\ell_8$ is the highest, implying consistency of the averaged Bayes factor.

Table 7.2: Averages of $\frac{1}{5} \times$ marginal log-likelihood

Model      $\ell_i$
(0,0,0)    -1.21923
(0,0,1)    -0.21428
(0,1,0)    1.47992
(0,1,1)    2.102966
(1,0,0)    -1.222362
(1,0,1)    -0.21898
(1,1,0)    1.459921
(1,1,1)    2.121237 (true model)


7.2 Bayes factor analysis for n = 15 and T = 5 

In this case we allow our parameter and the covariate sets to vary from individual to individual. We
consider 15 individuals modeled by

$$dX_i(t) = (\xi^{(i)}_1 + \xi^{(i)}_2 z_1(t) + \xi^{(i)}_3 z_2(t) + \xi^{(i)}_4 z_3(t))(\xi^{(i)}_5 + \xi^{(i)}_6 X_i(t))dt + \sigma_i dW_i(t) \tag{7.4}$$

for $i = 1,\ldots,15$. We fix our diffusion coefficients as $\sigma_{i+1} = \sigma_i + 5$ for $i = 1,\ldots,14$ where $\sigma_1 = 10$.
We consider the initial value $X_i(0) = 0$ and the interval $[0, T]$, with $T = 5$. As before, we generated the
observed data after discretizing the time interval into 500 equispaced time points. Here our covariates
and the parameter set $\theta^{(i)}_0 = (\xi^{(i)}_{0,1}, \xi^{(i)}_{0,2}, \xi^{(i)}_{0,3}, \xi^{(i)}_{0,4}, \xi^{(i)}_{0,5}, \xi^{(i)}_{0,6})$; $i = 1,\ldots,15$, are simulated in a similar way
as mentioned in Section 7.1.

For each of the 15 individuals, the true set of covariate combination is randomly selected. Thus, for 
a given model, there are 15 sets of covariate combinations to be compared with other models consisting 
of 15 different sets of covariate combinations. To decrease computational burden we compare the true 
model with 100 other models consisting of different sets of covariate combinations. 
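
The design just described can be set up in a few lines. The sketch below is illustrative only: the text does not say how the 100 competing models were chosen, so drawing them uniformly at random without repetition is an assumption made here, as is the seed.

```python
import random
from itertools import product

rng = random.Random(2016)  # arbitrary seed, for reproducibility only
n = 15

# sigma_1 = 10 and sigma_{i+1} = sigma_i + 5
sigmas = [10 + 5 * i for i in range(n)]

# All 8 covariate indicator triples for (z1, z2, z3).
combos = list(product((0, 1), repeat=3))

# True model: one randomly selected covariate combination per individual.
true_model = [rng.choice(combos) for _ in range(n)]

# 100 distinct competing models, each a list of 15 covariate combinations.
competitors = []
while len(competitors) < 100:
    candidate = [rng.choice(combos) for _ in range(n)]
    if candidate != true_model and candidate not in competitors:
        competitors.append(candidate)
```

There are $8^{15}$ possible models in total, which is why only a subset of competitors is compared with the true model.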

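The Bayes factor corresponding to the $j$-th covariate combination is given by

$$I^{j}_{n,T} = \prod_{i=1}^{n}\int \frac{f_{\theta^{(i)}_1,T}(X_{i,0,T})}{f_{\theta^{(i)}_0,T}(X_{i,0,T})}\,\pi\left(\theta^{(i)}_1\right)d\theta^{(i)}_1 \tag{7.5}$$

for $j = 1,\ldots,100$, where $n = 15$, $T = 5$ and $\theta^{(i)}_0$ is the true parameter set corresponding to the $i$-th
individual.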

We obtain the MLE of the 15 parameter sets by simulated annealing. Then we calculate the Bayes
factor with the prior such that the parameter components are independent normal with means as the
respective MLEs and variances 1. In all the cases corresponding to 100 covariate combinations we obtain
$\frac{1}{nT}\log I^{j}_{n,T} < 0$ for $j = 1,\ldots,100$. Thus, the Bayes factor indicated the correct covariate combination in
all the cases considered. We also considered the case when a normal prior is considered for the
parameters of the true model. In this case, with respect to the component-wise independent normal prior with
individual means as obtained from simulated annealing and component-wise variance $0.1^2$, we obtain

$$\frac{1}{15\times 5}\left[\log\prod_{i=1}^{15}\int f_{\theta^{(i)}_1,T}(X_{i,0,T})\,\pi(d\theta^{(i)}_1) - \log\prod_{i=1}^{15}\int f_{\theta^{(i)}_0,T}(X_{i,0,T})\,\pi(d\theta^{(i)}_0)\right] < 0, \tag{7.6}$$

for $j = 1,\ldots,100$. Indeed, it turned out that $\frac{1}{15\times 5}\log\prod_{i=1}^{15}\int f_{\theta^{(i)}_0,T}(X_{i,0,T})\,\pi(d\theta^{(i)}_0) = 0.4865$
and the maximum of $\frac{1}{15\times 5}\log\prod_{i=1}^{15}\int f_{\theta^{(i)}_1,T}(X_{i,0,T})\,\pi(d\theta^{(i)}_1)$ is 0.4127. In other words, the
Bayes factor consistently selects the correct model even in this situation.

[Figure 8.1: Some company-wise time-series data. Panels: real data for the 1st, 11th and 12th companies.]


8 Company-wise data from national stock exchange 

To deal with real data we collect the stock market data (467 observations during the time range August
5, 2013, to June 30, 2015) for 15 companies, which is available on www.nseindia.com. The nature of
some company-wise data is shown in Figure 8.1.

Each company-wise data set is modeled by various available standard financial SDE models with the
"fitsde" function available in R. After obtaining the BIC (Bayesian Information Criterion) for each
company corresponding to each available financial model, we find that the minimum value of BIC
corresponds to the CKLS model, given, for process $X(t)$, by

$$dX(t) = (\theta_1 + \theta_2 X(t))dt + \theta_3 X(t)^{\theta_4}dW(t).$$

As per our theory we treat the diffusion coefficient as a fixed quantity. So, after obtaining the estimated
values of the coefficients by the "fitsde" function, we fix the values of $\theta_3$ and $\theta_4$, so that the diffusion
coefficient becomes fixed. We let $\theta_3 = A$, $\theta_4 = B$.
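
For readers who want to see what paths of the CKLS model look like once the diffusion parameters are held fixed, the following is a minimal Euler-Maruyama sketch. It is not the "fitsde" routine and does not reproduce the fitted values; the function name, the truncation at zero and the assumption $B > 0$ are choices made here for illustration.

```python
import math
import random


def simulate_ckls(theta1, theta2, a, b, x0, T, steps, rng=random):
    """Euler-Maruyama path of dX = (theta1 + theta2*X)dt + a*X**b dW.

    Assumes b > 0. The current value is truncated at 0 before each step,
    because X**b is undefined for negative X when b is not an integer.
    """
    dt = T / steps
    x = [x0]
    for _ in range(steps):
        cur = max(x[-1], 0.0)
        dw = rng.gauss(0.0, math.sqrt(dt))
        x.append(cur + (theta1 + theta2 * cur) * dt + a * cur ** b * dw)
    return x
```

A call such as `simulate_ckls(1.0, -0.5, 0.2, 0.5, 2.0, 1.0, 467)` produces one path on a grid of 468 points; the parameter values in this call are purely hypothetical.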

In this CKLS model, we now wish to include time varying covariates. In our work we consider the
"close price" of each company. The stock market data is assumed to be dependent on IIP general index,
bank interest rate, US dollar exchange rate and on various other quantities. But we assume only these
three quantities as possibly the most important time dependent covariates.

Briefly, IIP, that is, index of industrial production, is a measurement which represents the status of
production in the industrial sector for a given period of time compared to a reference period of time.
It is one of the best statistical data, which helps us measure the level of industrial activity in the Indian
economy. Its importance lies in the fact that low industrial production will result in lower corporate
sales and profits, which will directly affect stock prices. So a direct impact of weak IIP data is a sudden
fall in stock prices.

As the IIP data is purely industrial data, the banking sector is not included in it. So, we also consider the
bank interest rate as another covariate. Note that the higher the bank interest rate, the more attractive
fixed deposits become, and one will preferably deposit money in a bank rather than invest in the stock market.
Besides, companies with a high amount of loans in their balance sheets would be affected very seriously.
Interest cost on existing debt would go up, affecting their EPS (Earning per Share) and ultimately the
stock prices. But during low interest rates these companies would stand to gain. The banking sector is likely
to benefit most due to high interest rates. The Net Interest Margins (the difference between the
interest they earn on the money they lend and the interest they pay to the depositors) for banks are likely
to increase, leading to growth in profits and the stock prices. Hence, it is clear that the interest rates
and stock markets are inversely related. As the interest rates go up, stock market activities tend to come
down.

[Figure 8.2: Covariates. Panels: 1st covariate, 2nd covariate, 3rd covariate.]


Finally, exchange rates directly affect the realized return on an investment portfolio with overseas
holdings. If one owns stock in a foreign company and the local currency goes up, the value of the
investment also goes up. Foreign investment is also related very much to the US dollar exchange rate.

Hence, we collect the values of the aforementioned time varying covariates during the time range
August 5, 2013, to June 30, 2015. The patterns of the covariates are displayed in Figure 8.2.

We denote these three covariates by $c_1$, $c_2$, $c_3$ respectively. Now, our considered SDE models for
national stock exchange data associated with the 15 companies are the following:

$$dX_i(t) = (\theta^i_1 + \theta^i_2 c_1(t) + \theta^i_3 c_2(t) + \theta^i_4 c_3(t) + \theta^i_5 X_i(t))dt + A^i X_i(t)^{B^i}dW_i(t), \tag{8.1}$$

for $i = 1,\ldots,15$.


8.1 Selection of covariates by Bayes factor 

Among the considered three time varying covariates we now select the best set of covariate combinations
for the 15 companies among 100 such sets through Bayes factor, computing the log-marginal-likelihoods
with respect to the normal prior on the parameter set, assuming a priori independence of the parameter
components with individual means being the corresponding MLE (based on simulated annealing) and
$0.01^2$ variance (relatively small variance ensured numerical stability of the marginal likelihood). Table
8.1 provides the sets of covariates for the 15 companies obtained by our Bayes factor analysis. Also
observe that each of the three covariates occurs in about 50% of the companies, demonstrating
that the overall impact of these on the national stock exchange is undeniable.


9 Summary and discussion 


This article establishes the asymptotic theory of Bayes factors when the models to be compared are
systems of SDE's consisting of time-dependent covariates and random effects, assuming that the number
of individuals as well as the domains of observations of the individuals increase indefinitely. Different
initial values for different SDE's are also allowed. The only instance of related effort in this direction
is that of Maitra and Bhattacharya (2016a). The main difference of our undertaking with that of Maitra
and Bhattacharya (2016a) is that they assumed the domains of observations to be fixed for the
individuals, a consequence being that incorporation of random effects in their model was not possible from
the asymptotic perspective. Moreover, in their case, a single set of covariates was associated with all
the individuals, but here our random effects set-up allows different sets of time-dependent covariates for
different individuals.

Table 8.1: Company-wise covariates obtained by Bayes factor analysis

Company    Covariates
1          Bank rate
2          US dollar exchange rate
3          None
4          None
5          Bank rate and US dollar exchange rate
6          Bank rate and US dollar exchange rate
7          IIP general index and US dollar exchange rate
8          Bank rate
9          IIP general index and Bank rate
10         IIP general index
11         IIP general index, Bank rate and US dollar exchange rate
12         IIP general index and Bank rate
13         US dollar exchange rate
14         IIP general index, Bank rate and US dollar exchange rate
15         IIP general index


To proceed, we first needed to build an asymptotic theory of Bayes factors for comparing two
individual SDE's, rather than two systems of SDE's, as the domain of observation expands. Our results
in this regard, which help formulate our asymptotic theory for comparing two systems of SDE's using
Bayes factors, are perhaps also of independent interest, being possibly the first ever results in this
direction of research. Although the relevant variance did not converge to zero when two individual SDE's
are compared, we are able to establish almost sure exponential convergence of the Bayes factor when
the number of subjects is allowed to increase indefinitely. Importantly, our theory covers both iid and
non-iid cases.

Our simulation studies associated with covariate selection demonstrate that the Bayes factor yields
consistent results even in non-asymptotic situations. Bayes factor analysis of real data on company-wise
national stock exchange also yielded plausible sets of covariates for the companies.

Note that our current asymptotic Bayes factor theory remains valid for comparison between iid and
non-iid models. For instance, if the true model constitutes an iid system, then $f_{0i} = f_0 = f_{\theta_0}$; the rest
remains the same as the theory for our non-iid setting. The situation is analogous when the other model
forms an iid system.

Supplementary Material

Throughout, we refer to our main manuscript Maitra and Bhattacharya (2016b) as MB.


S-1 Proof of Lemma 2 of MB 

Due to compactness of $\Gamma$ it follows, using the form of $\phi_{\xi}$ provided in (H5"), that the convergences
(3.9), (3.10) and (3.11) of MB are uniform over $\Gamma$. The same form shows that the above integrals are
continuous in $\xi$ for every $T > 0$. Hence, due to uniform convergence, the corresponding limits
are continuous in $\xi$.


S-2 Proof of Lemma 3 of MB 


The proofs of (i) - (iv) follow from (H1"), (H4"), the results (3.10) and (3.11) of MB following from
(H5"), (3.1) and its asymptotic form (3.2) (with $k = 2$), using the relation (2.8) of MB. To prove (v),
note that, since for any $k \geq 1$, it holds, due to (H4"), (3.1) of MB and boundedness of $\phi_{\xi_j}$ on $[0, T]$, that
for $j = 0,1$,


E 




a{s,X{s)) 


2k 


ds < oo, 


it follows from Theorem 7.1 of |Mao| ( |2011| l, page 39, that 


E 


lo cr{s,X{s)) 


dW{s] 


2k 


< {k{2k-l)fT'^-^E 


hAs'^^l3As,X{s)) 


a{s,X{s)) 


2k 


ds. 

(S-2.1) 


Hence, using Chebychev’s inequality, it follows that for any e > 0, 
1 </'S.('S)b/3,(s,2f(s)) 


p 


1 

Jo 


a{s,X{s)) 


-dW{s) 


> € 


< e 


-2k 


{k{2k - 1))’^ E 




a{s,X{s)) 


2k 


ds. 


(S-2.2) 


In particular, if A; = 2 is chosen, then it follows from the above inequality, (H4") , (3.2) of MB, and 
boundedness of on [0, T], that 


T=1 


proving that 


1 hjA)b^^{s,X{s)) 

rio tT(s,x(s)) 

1 </>$,('S)h/3.(s,X(s)) 


dW{s] 


TJo ^is,X{s)) 


dW{.s) 


> e < oo, 


0 . 
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To prove (vi), first note that 



1 r 

'(PlXs)blXs,X{s)) 


E 

tL 


ds 


2 k 


< T-^E 


< KiT-'^E 




bl{s,X{s)) 


cr2(s,X(s)) 


- KjiPj) 


bl(s,X{s)) 


cr2(s,X(s)) 


- KjiPj) 


2 k 


ds 


ds, 


(S-2.3) 


for some finite constant $K_4 > 0$. The second last inequality is by Hölder's inequality, and the last
inequality holds because $\phi_{\xi_j}(t)$ is uniformly bounded on $[0, \infty]$ thanks to compactness of $Z$ and continuity
of the functions $g_l$; $l = 1,\ldots,p$. Hence, for any $\epsilon > 0$,


P 


1 ^^is)bl^is,X{.s)) 

T Jo a‘^{s,X{s)) ^ 




< K^e-'^^P-^E 



blXs,X{.s)) 

a^{s,X{s)) 




2 k 


ds. 


In the same way as the proof of (v), it follows, using the above inequality, (3.3) and (3.2) of MB, that 


That is. 


00 




f 1 4>lXs)blis,X{s)) 
\T Jo o-2(s,X(s)) 

1 fT (t>lXs)b% Xs,X{.s)) 
T Jo a‘^{s,X{s)) 


1 r'^ 

kijifdj)- J (?i|.(s)ds 


> e 


1 

Kj{l3j)^ J 4>l.{s)ds 0 , 


< 00 . 


almost surely, as $T \rightarrow \infty$. Since, as $T \rightarrow \infty$, $\frac{1}{T}\int_0^T \phi^2_{\xi_j}(s)ds \rightarrow \bar\phi^{(2)}_{\xi_j}$ by (3.10) of MB, the result
follows. Using (3.4) instead of (3.3), (vii) can be proved in the same way as (vi). The proofs of (viii)
and (ix) follow from (v), (vi) and (vii), using the relation (2.8).


S-3 Proof of Lemma 4 of MB 


Using the Cauchy-Schwarz inequality twice we obtain

1 r P ( ^ioX^)hiX^)bf 3 o{s, X{s))b, 3 ^{s, X{.s))' 

tJo V o^{s,X{s)) 
Taking the limit of both sides of (S-3.1) as $T \rightarrow \infty$, using (ii) of Lemma 3 and the limits (3.10), the
result follows.


S-4 Proof of Lemma 5 of MB 

For any h G (0, t), 


It 


/c 






TT{d6i) 




/e exp (^iUe„t - Ue„t) - vr(d0i) 

/©exp - U0^,t-h) - 7r(d0i) 

f ( (TT TT \ iV01,t-h - Veo,t-h)' 

= exp ( - Ue^^t-h) -^- 


exp ( U0^^t-h,t - UeQ,t-h,t - h,t Vg^^t h,t) j 
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T,_ 


t—h 
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exp 


rt <i>io{s)bpQ{s,X{s)) , , . 1 nt 

H-h 0-2(5, X(s)) 2 h-h o-2(5,X(s')') 


0-2(5,X(s)) 


fOQ,t—h,t{,Xt—h,t) 

Hence, the result holds. 


(S-4.1) 


S-5 Proof of Theorem 6 of MB 


Let us consider 


n{T)-l 

STq„ = Tqn ^ (4rg„ +^rTg„) > 

r=0 


(S-5.1) 


where where, given T > 0, n{T) is the number of intervals partitioning [0, T] each of length 

-X. We assume that as T —)• oo, —)• 0. 

n(T) n{T) 

It follows, using (H9"), that for any T > 0, 


E 



1 d 1 

Tf I -j,Eeo{logIt)dt + - I Eoo 


= E, 


0 dt 
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X'tifooJ) 


dt 


Oo 


^log/T 1 + ^ 
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E, 


Oo 


iC'AfeoJ) dt, 


(S-5.2) 
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as n{T) —)■ oo, for any given T > 0. Also, since due to (4.26) of MB, E ( \^rTq„ ) = 0, 


we must have E ( 


= E 


= 0 for any T, n(T). Thus, it follows from (S-5.2), that 


^ ( ■^rTq^ + KTqJ^rTq^ ) = 0, for any r, T, n(T). Hence, 


1 


-^00 i 7^ ^og It )+- 


1 


E, 


00 


dt ^ 0, asT—)-cx). 


We now deal with the second term of the left hand side of (S-5.31. Since, by (H6 ), 


vr (^01 : inf = 1, 

it holds that > 5 for all t with probability 1, so that 

K{fe,J) = }C'{feoJM 5 )), 

where At{6) is given by (4.19) of MB. The Q* property implies that 


1 r 

liminf — / Ea„ 
T T Jo 


f^tifeoJ) 


dt > 6. 


The results (S-5.31 and ( S-5.4| ) imply that 


lim^sup Ee^ ^ log -^T ) < S. 


Now observe that 


-^T = y exp(C/0j,T — Ubq^t) X exp |“2 '^{dOi) 

UquT U0q,t' 


> / exp T 
JNo{c) 


X exp 


T j, 

T fVe^^T V0o,T 


7r(d0i), 


where c > 0, and 


Moic) = {ei€Q-.6<K^ {fe„ f0,)<S + c} 


= 5 < 


-|^«:o(/ 3 o) + ^«:i(/ 3 i) <d + c 


2 


(S-5.3) 


(S-5.4) 


(S-5.5) 


(S-5.6) 


the second line following from (4.15) of MB. Using Jensen’s inequality, we obtain 

'C^0i,T 


1= log {It) > [ 
^ Ja 


A/b(c) . 


1 / Vgi.T _ V0o,T 

2 I ¥~ 


7 r{d 0 i). 


(S-5.7) 


By (vi) - (ix) of Lemma 3 of MB, the integrand of the right hand side of the above inequality, which we 

7(2) t{2) 1 

- 4>f^^^^K{Po, (3^) + -^ko{(3q) ,pointwise 


denote by converges to g{6i) = — 


for every 6i, given any path of the process X in the complement of the null set. Due to (HI"), (H4") 
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and (3.2) of MB, sup Eq^ < oo, so that is uniformly integrable. Hence, 

T 



gxT{^i)T^{dOi) 



5(0i)7r(d0i), 


given any path of the process X in the complement of the null set. Let us denote the left hand side of 
the above by Lfxy let H denote the right hand side. We just proved that Hx^ converges to H almost 
surely. Now observe that 


sup Eoo [Hxt? 
T 


sup Ee^ 
T 



2 


gxT{^i)T^{dei) 


< [ sup E 0 ^[gxrridi)f T^{dOi). 
JJVoic) T 


Again, due to (Hi"), (H4") and (3.2) of MB, the last expression is finite, proving uniform integrability 
of Hence, 

^lim Ee, {Hxr) = Eg, {H) = H. 

It follows that 


liminf Eg„ 
T^oo 


^ log (It) 


> liminf Eg^ {Hxt) 


Eeo (H) 



p(0i)7r(0i)d0i 


> - ((5 + c) TT (A/'o(c)) 

> - ((5 + c). 


(S-5.8) 


Since the above holds for arbitrary c > 0, it holds that 


liminf Eg„ ( — log Jr 1 > — (5- 


T^oo 


T 


Thus (S-5.51 and (S-5.91 together help us conclude that 


Eg ( — log It 1 —> —S. 


(S-5.9) 


as T —)> oo. 

We now show that the variance of $\frac{1}{T}\log I_T$ is $O(1)$, as $T \rightarrow \infty$. First note, due to compactness of
$\Theta$, the mean value theorem for integrals ensures existence of $\tilde\theta_1 = \tilde\theta_1(W) \in \Theta$, depending
on the Wiener process $W$ such that

$$\frac{1}{T}\log I_T = \frac{1}{T}\left(U_{\tilde\theta_1,T} - U_{\theta_0,T}\right) - \frac{1}{2T}\left(V_{\tilde\theta_1,T} - V_{\theta_0,T}\right). \tag{S-5.10}$$

Now note that the results presented in Lemma 3 of MB continue to hold even when $\theta_1$ is replaced
with $\tilde\theta_1$. Specifically, the following hold in addition to the results of Lemma 3:
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It follows that, as T —oo, 


^ log/T - ^ 0. 


By uniform integrability arguments, which follow in similar lines as the proofs of Lemmas 1 and 10 of 
Maitra and Bhattacharya| (2016al using (H4"), compactness, and Cauchy-Schwartz, it holds that 

Var 0 Q (Et) —)• 0, as T —)• oo. (S-5.11) 

Hence, 

Vare^ log/r^ = Vare^ (^£t + 

= Vare^ {It) + Vare^ /3i) - 

+ 2Coveo k(/3o,/3i) - - J4!^ko(/3o) 


2"ii 




(S-5.12) 

By (S-5.11), the first term of (S-5.12) goes to zero as $T \rightarrow \infty$, and the third, covariance term tends to
zero by Cauchy-Schwarz and (S-5.11). In other words, as $T \rightarrow \infty$,


However, 


Vare^ { ^log/r^ - Varg^ 


VaroQ { ki (/ 3 i ) ) 0 , 


0 . 


(S-5.13) 


unless k(/3o,/3i) — ^cjy- Ki{fii) is constant almost surely. It then follows from (|S-5.13|l that 


$o4i 


Varoo loglT^ = 0(1), as T oo. 


(S-5.14) 


S-6 Proof of Theorem 8 of MB 

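In our set-up it follows from (2.7) of MB that

$$\frac{1}{nT}\log I_{n,T} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{T}\log I_{i,T}. \tag{S-6.1}$$

In the iid case, given $T > 0$, using the above form, it follows by the strong law of large numbers, that

$$\lim_{n\rightarrow\infty}\frac{1}{nT}\log I_{n,T} = E_{\theta_0}\left(\frac{1}{T}\log I_{1,T}\right), \tag{S-6.2}$$

almost surely. Now, in the iid situation, for each $i$, $E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right) \rightarrow -\delta$, as $T \rightarrow \infty$. Hence, taking
limit as $T \rightarrow \infty$ on both sides of (S-6.2) yields

$$\lim_{T\rightarrow\infty}\lim_{n\rightarrow\infty}\frac{1}{nT}\log I_{n,T} = \lim_{T\rightarrow\infty}E_{\theta_0}\left(\frac{1}{T}\log I_{1,T}\right) = -\delta, \tag{S-6.3}$$

almost surely, proving the theorem.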
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S-7 Proof of Lemma 10 of MB 


Note that, due to compactness of $\mathfrak{X}$ and $Z$ and continuity of the covariates in time $t$, there exist $x^* \in \mathfrak{X}$
and $z^* \in \mathfrak{Z}$, such that

$$\sup_{x\in\mathfrak{X},\,z\in\mathfrak{Z}}\left|\frac{1}{T}E_{\theta_0}\left(\log I_{x,T,z}\right) + \delta(x,z)\right| = \left|\frac{1}{T}E_{\theta_0}\left(\log I_{x^*,T,z^*}\right) + \delta(x^*,z^*)\right| \rightarrow 0, \tag{S-7.1}$$

as $T \rightarrow \infty$, where the convergence is due to (6.2) of MB. Also, $\frac{1}{T}E_{\theta_0}\left(\log I_{x,T,z}\right)$ is clearly continuous
in $(x, z)$ for every $T > 0$ (the proof of this follows in the same way as that of Theorem 5 of Maitra and
Bhattacharya (2016c)). Combining this with the uniform convergence (S-7.1) it follows that $\delta(x, z)$ is
also continuous in $(x, z)$.


S-8 Proof of Lemma 11 of MB 

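Note that the limit (6.3) of MB can be represented as

\[
\delta^{\infty} = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\delta(x^{i},z_{i}) = \lim_{n\to\infty}\frac{1}{n}\sum_{r=0}^{n-1} g_{\delta}\left(\frac{r}{n}\right), \tag{S-8.1}
\]

where $g_\delta$ is some continuous function on $[0,1]$ satisfying $g_\delta\left(\frac{r}{n}\right) = \delta(x^{r+1}, z_{r+1})$ for $r = 0,\ldots,n-1$. For the remaining points $y \in [0,1]$, we set $g_\delta(y) = \delta(x,z)$, where $(x,z) \in \mathfrak{X}\times\mathfrak{Z}$ is such that $g_\delta(y)$ is continuous in $y \in [0,1]$. Since $\delta(x,z)$ is continuous in $(x,z)$, $g_\delta(y)$ can be thus constructed. Note that it is possible to relate $y \in [0,1]$ to $(x,z) \in \mathfrak{X}\times\mathfrak{Z}$ by some continuous mapping $G : \mathfrak{X}\times\mathfrak{Z} \to [0,1]$, taking $(x,z)$ to $y$. Thus, $\delta^\infty$ in (6.3) of MB is the limit of the Riemann sum (S-8.1) associated with the continuous function $g_\delta$; the limit is given by the integral $\int_0^1 g_\delta(y)\,dy$. Since the domain of integration is $[0,1]$, it follows, using continuity of $g_\delta$, that the integral is finite. Observe that for any given sequence $\{(x^i, z_i)\}_{i=1}^{\infty}$, one can construct a continuous function $g_\delta$ such that $\delta^\infty = \int_0^1 g_\delta(y)\,dy$. In other words, $\delta^\infty$ exists for all sequences $\{(x^i, z_i)\}_{i=1}^{\infty}$.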
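The Riemann-sum representation used in this proof can be illustrated numerically. The sketch below assumes a purely hypothetical continuous function $g_\delta(y) = \tfrac{1}{2}(1+y)^2$ on $[0,1]$ (chosen only because its integral, $7/6$, is known in closed form) and evaluates the left Riemann sum $\frac{1}{n}\sum_{r=0}^{n-1} g_\delta(r/n)$.

```python
def g_delta(y):
    """Hypothetical continuous stand-in for g_delta on [0, 1]."""
    return 0.5 * (1.0 + y) ** 2


def left_riemann_sum(g, n):
    """Return (1/n) * sum_{r=0}^{n-1} g(r/n)."""
    return sum(g(r / n) for r in range(n)) / n


# Closed-form value of the integral of g_delta over [0, 1].
EXACT_INTEGRAL = 7.0 / 6.0

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        approx = left_riemann_sum(g_delta, n)
        print(n, approx, abs(approx - EXACT_INTEGRAL))
```

The printed error shrinks roughly like $1/n$, as expected for a left Riemann sum of a continuously differentiable function.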


S-9 Proof of Theorem 12 of MB 


For given $T > 0$, it follows from (S-5.14), compactness of $\Theta$, $\mathfrak{X}$, $\mathfrak{Z}$, and continuity of the relevant


functions bp., gi,... ,gp, kq, ki and k, that 


\[
\sup_{x\in\mathfrak{X},\,z\in\mathfrak{Z}} \mathrm{Var}_{\theta_0}\left(\frac{1}{T}\log I_{x,T,z}\right) < \infty. \tag{S-9.1}
\]


Hence, given $T > 0$,

\[
\sum_{i=1}^{\infty}\frac{1}{i^{2}}\,\mathrm{Var}_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right) < \infty.
\]
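The summability condition required by Kolmogorov's strong law follows from the uniform variance bound (S-9.1) in one line. Writing $V_T$ (a symbol introduced here for convenience) for the supremum in (S-9.1), and using that each pair $(x^i, z_i)$ lies in $\mathfrak{X}\times\mathfrak{Z}$:

```latex
\[
\sum_{i=1}^{\infty}\frac{1}{i^{2}}\,
\mathrm{Var}_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right)
\;\le\; V_T\sum_{i=1}^{\infty}\frac{1}{i^{2}}
\;=\; \frac{\pi^{2}}{6}\,V_T \;<\; \infty .
\]
```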


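It then follows due to Kolmogorov's strong law of large numbers for independent random variables that

\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\frac{1}{T}\log I_{i,T} = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right), \tag{S-9.2}
\]

almost surely.
Now observe that the right hand side of (S-9.2) admits the following representation:

\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right) = \lim_{n\to\infty}\frac{1}{n}\sum_{r=0}^{n-1} g\left(\frac{r}{n},T\right), \tag{S-9.3}
\]

where $g(\cdot,T) : [0,1] \to \mathbb{R}$ is some continuous function depending upon $T$ with $g\left(\frac{r}{n},T\right) = E_{\theta_0}\left(\frac{1}{T}\log I_{r+1,T}\right)$.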

Since $\frac{1}{T}E_{\theta_0}\left(\log I_{x,T,z}\right)$ is continuous in $(x,z)$, $g(y,T)$ can be constructed as in Lemma 11 of MB.

Then, for almost all $y \in [0,1]$, $g(y,T) \to -\delta(x,z)$ as $T \to \infty$, for appropriate $(x,z)$ associated with $y$ via $y = G(x,z)$ as in Lemma 11. Also, it follows from (S-5.10) that $g(\cdot,T)$ so constructed is uniformly bounded in $T > 0$. Thus, the conditions of the dominated convergence theorem are satisfied.
Since (S-9.3) is nothing but the Riemann sum associated with $g(\cdot,T)$, it follows that


\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right) = \int_0^1 g(y,T)\,dy. \tag{S-9.4}
\]


By construction of $g(y,T)$, the dominated convergence theorem holds for the right hand side of (S-9.4).
Hence, 


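\[
\begin{aligned}
\lim_{T\to\infty}\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right)
&= \lim_{T\to\infty}\int_0^1 g(y,T)\,dy = \int_0^1 \lim_{T\to\infty} g(y,T)\,dy \\
&= \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\lim_{T\to\infty}E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right)
 = -\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\delta(x^{i},z_{i}) \\
&= -\delta^{\infty}.
\end{aligned}
\tag{S-9.5}
\]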


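Combining (S-9.5) with (S-9.2) it follows that

\[
\lim_{T\to\infty}\lim_{n\to\infty}\frac{1}{nT}\log I_{n,T}
= \lim_{T\to\infty}\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\frac{1}{T}\log I_{i,T}
= \lim_{T\to\infty}\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right)
= -\delta^{\infty}, \tag{S-9.6}
\]

almost surely.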
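The non-iid conclusion can also be illustrated on a toy model, again not the model of MB. The sketch assumes $dX_i(t) = c_i\theta\,dt + dW_i(t)$, where $c_i$ is a hypothetical deterministic multiplier playing the role of covariate information (here periodic in $i$ with values $1, 1.25, 1.5, 1.75$), and point-mass priors on `theta0` (true) and `theta1`. Under these assumptions $E_{\theta_0}\left(\frac{1}{T}\log I_{i,T}\right) = -\tfrac{1}{2}c_i^2(\theta_1-\theta_0)^2$, the variances are bounded uniformly in $i$, and $\delta^\infty$ is the average of $\tfrac{1}{2}c_i^2(\theta_1-\theta_0)^2$ over one period.

```python
import math
import random


def covariate_factor(i):
    """Hypothetical deterministic multiplier c_i in [1, 2), periodic in i."""
    return 1.0 + ((i - 1) % 4) / 4.0


def noniid_average(n, T, theta0=0.5, theta1=1.5, seed=0):
    """Return (1/n) * sum_i (1/T) * log I_{i,T} for the toy non-iid model."""
    rng = random.Random(seed)
    total = 0.0
    for i in range(1, n + 1):
        c = covariate_factor(i)
        # Under the true model, X_i(T) - X_i(0) = c_i*theta0*T + W_i(T).
        increment = c * theta0 * T + rng.gauss(0.0, math.sqrt(T))
        # Girsanov log-likelihood ratio of drift c_i*theta1 against c_i*theta0.
        log_ratio = c * (theta1 - theta0) * increment \
            - 0.5 * c ** 2 * (theta1 ** 2 - theta0 ** 2) * T
        total += log_ratio / T
    return total / n


def delta_infinity(theta0=0.5, theta1=1.5):
    """Cesaro limit of delta_i = 0.5 * (c_i * (theta1 - theta0))**2."""
    cycle = [covariate_factor(i) for i in range(1, 5)]
    return sum(0.5 * (c * (theta1 - theta0)) ** 2 for c in cycle) / len(cycle)


if __name__ == "__main__":
    # The two printed numbers should be close for large n.
    print(noniid_average(n=200_000, T=10.0), -delta_infinity())
```

Because the individual terms have different means, the ordinary iid strong law does not apply here; the agreement of the two printed values reflects Kolmogorov's strong law together with the Cesàro limit of the $\delta_i$.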